Bernard Marr at LinkedIn Pulse: “The field of Big Data requires more clarity and I am a big fan of simple explanations. This is why I have attempted to provide simple explanations for some of the most important technologies and terms you will come across if you’re looking at getting into big data.
Here they are:
Algorithm: A mathematical formula or statistical process run by software to perform an analysis of data. It usually consists of multiple calculations steps and can be used to automatically process data or solve problems.
Amazon Web Services: A collection of cloud computing services offered by Amazon to help businesses carry out large scale computing operations (such as big data projects) without having to invest in their own server farms and data storage warehouses. Essentially, Storage space, processing power and software operations are rented rather than having to be bought and installed from scratch.
Analytics: The process of collecting, processing and analyzing data to generate insights that inform fact-based decision-making. In many cases it involves software-based analysis using algorithms. For more, have a look at my post: What the Heck is… Analytics
Big Table: Google’s proprietary data storage system, which it uses to host, among other things its Gmail, Google Earth and Youtube services. It is also made available for public use through the Google App Engine.
Biometrics: Using technology and analytics to identify people by one or more of their physical traits, such as face recognition, iris recognition, fingerprint recognition, etc. For more, see my post: Big Data and Biometrics
Cassandra: A popular open source database management system managed by The Apache Software Foundation that has been designed to handle large volumes of data across distributed servers.
Cloud: Cloud computing, or computing “in the cloud”, simply means software or data running on remote servers, rather than locally. Data stored “in the cloud” is typically accessible over the internet, wherever in the world the owner of that data might be. For more, check out my post: What The Heck is… The Cloud?
Distributed File System: Data storage system designed to store large volumes of data across multiple storage devices (often cloud based commodity servers), to decrease the cost and complexity of storing large amounts of data.