Big Data and its “big misunderstandings”
There are two common misunderstandings when the subject is Big Data.
The first is that we immediately think of the literal meaning of the term, a “big amount of information”. In fact, the concept is much broader, which brings us to the second misunderstanding: “Only companies with a great volume of data should adopt these solutions.”
There are a few variations of the definition, but we prefer the 5 V’s concept: volume, variety, velocity, veracity and value.
As expected, the first V is for volume.
This one comes from the huge amount of content produced on social media, blogs, websites, wikis, videos, audio, messages, etc. We are not talking about gigabytes or terabytes, but zettabytes and yottabytes, and the amount has been growing fast. Just to give you an idea, a zettabyte is 1,000,000,000,000,000,000,000 bytes (10^21), and a yottabyte has three more zeros on the right.
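To make the scale concrete, here is a small Python sketch of the decimal byte units as powers of ten. The table and its name (UNITS) are only an illustration, and it assumes the decimal (SI) definitions rather than the binary ones (zebibyte, yobibyte).

```python
# Decimal (SI) byte units expressed as powers of ten.
# UNITS is an illustrative name, not part of any library.
UNITS = {
    "gigabyte": 10**9,
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
    "yottabyte": 10**24,
}

# Print each unit with thousands separators so the zeros are easy to count.
for name, size in UNITS.items():
    print(f"1 {name} = {size:,} bytes")

# A yottabyte is a thousand zettabytes: three more zeros on the right.
ratio = UNITS["yottabyte"] // UNITS["zettabyte"]
```

Python integers have arbitrary precision, so these values are exact and no floating-point rounding is involved.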
The second V is for variety. In the past, data was produced by structured systems, meaning systems with a traditional database that stored the data in an organized way, so it was easy to retrieve later. Now, data comes from emails, presentations, videos, audio, social media, messaging apps, etc. It is frequently unstructured and unorganized, which makes it harder to retrieve and analyze. An interesting example of a tool that makes retrieval easier, which most people use without knowing what it is for, is the hashtag (#). It is used to group everything that refers to a given subject.
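The hashtag idea can be sketched in a few lines of Python: unstructured posts become retrievable once they are indexed by the tags they contain. The function name (group_by_hashtag) and the sample posts are made up for illustration, and the pattern assumes a hashtag is a "#" followed by letters, digits or underscores, which is simpler than the rules real platforms apply.

```python
import re
from collections import defaultdict

# Assumption: a hashtag is "#" followed by word characters.
HASHTAG = re.compile(r"#(\w+)")

def group_by_hashtag(posts):
    """Index free-text posts by the hashtags they contain (case-insensitive)."""
    groups = defaultdict(list)
    for post in posts:
        # Use a set so a post that repeats a tag is listed only once under it.
        tags = {tag.lower() for tag in HASHTAG.findall(post)}
        for tag in tags:
            groups[tag].append(post)
    return dict(groups)

# Hypothetical sample posts.
posts = [
    "Loving the keynote #BigData",
    "New dashboard is live #bigdata #analytics",
    "Lunch time!",
]
groups = group_by_hashtag(posts)
print(sorted(groups))
```

Posts without any hashtag, such as the third one, do not appear in any group, which is exactly why untagged content is harder to retrieve.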
Velocity can be understood in two ways:
One is the speed of content production, which gets faster every day. A post on social media generates comments in a matter of seconds.
The other is the ability to analyze the data as fast as possible and to interact with your audience. A good example is the way some organizations and TV channels use the hashtag (#) mentioned earlier.
Before we start using the data in our analysis, we must first verify that it is authentic and that it makes sense for our analysis.
This is one of the main goals of a Big Data project. It is the equivalent of data cleaning, but here we are talking about authenticating sources that we have no control over, since the data can come from any individual anywhere in the world. Take real life as an example: how many times do we see a post on social media or an email from a friend with information that we cannot tell is true? You must keep this kind of information out of the analysis, or it may cause many problems.
Finally, the use of data must generate value for the organization that has implemented or uses Big Data; otherwise it does not make sense. Here, value has a broader meaning: not only monetary, but also the improvement of services and processes.
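As a very rough illustration of keeping unverified information out of the analysis, the Python sketch below splits records by whether their source is on a list of sources you have already checked. Everything here is hypothetical: the Record type, the source names and the allowlist itself. Real veracity checks are far more involved than a simple lookup.

```python
from dataclasses import dataclass

@dataclass
class Record:
    source: str  # where the information came from
    text: str    # the content itself

# Hypothetical allowlist: sources that have already been verified by hand.
VERIFIED_SOURCES = {"official_press_release", "company_crm"}

def split_by_veracity(records):
    """Return (trusted, unverified) lists based on the record's source."""
    trusted = [r for r in records if r.source in VERIFIED_SOURCES]
    unverified = [r for r in records if r.source not in VERIFIED_SOURCES]
    return trusted, unverified

# Hypothetical sample data.
records = [
    Record("official_press_release", "Product X launches in March"),
    Record("forwarded_email", "Product X has been cancelled"),
    Record("company_crm", "Customer asked about Product X pricing"),
]
trusted, unverified = split_by_veracity(records)
print(len(trusted), "trusted,", len(unverified), "set aside for review")
```

The unverified records are set aside rather than deleted, so they can still be checked and brought into the analysis later.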
Big Data for all sizes
If we take the 5 V’s concept into consideration, Big Data can, and should, be used by all corporations. We are not talking only about companies that work with a great volume of data, but about working with data from multiple sources, collected with authenticity and in a way that generates value, which in some cases can also be bulky and fast.
Most organizations analyze only their own data, that is, data the company itself produces through its ERP and CRM systems. This is called BI (Business Intelligence), BA (Business Analytics), BD (Business Discovery) or any other initials recognized in the market.
Big Data is a concept supported by methodologies and software that seek to bring value to the analysis by cross-referencing and analyzing external data generated in many ways and by many entities (formal and informal organizations, individuals, etc.).
Its importance lies in knowledge, which humanity has sought since the beginning and which has evolved as the information gathered has been used more efficiently. Business is no different: the more you know about the market, its applications, particularities, products, consumers and competitors, the more competitive advantages you will have.
Everything that is new brings challenges. Big Data is new and unexplored ground that tends to rely on NoSQL databases. Not so long ago, when someone talked about multidimensional databases in BI, software and human resources were scarce. Today we still lack software and tools for training people. We also do not have a large body of experience and material to support the studies and the implementation. As usual, it is harder for the pioneers, but they can also achieve better results.
We are seeing many large companies and B2C businesses implementing Big Data projects, but even with the increase in investment, only a few companies are investing. The reasons are cultural factors, the low availability of people with the appropriate knowledge, and limited investment capacity.
In terms of the number of organizations, our market is composed of a large number of small and medium-sized organizations and a few large ones. In this scenario, many medium-sized organizations are only now having their first experiences with BI, BA or BD. In many cases, small organizations don’t have an ERP, so they use only the software necessary to meet their tax obligations. Even with the increase in investment, we are still talking about a select group of organizations that have already seen a return on the investment because they have been through previous initiatives. That is why Big Data is still not widely publicized.
As in any other market, scale lowers prices and makes resources available. Certainly, with the success of these projects, we will see a sharp increase in Big Data applications.