Archive for December, 2012
The Market Research Tortoise and the Big Data Hare … or a Misguided Analogy
Posted by Chris in Big Data Leadership, Intelligent Analytics Leadership, Market Research Leadership on December 31, 2012
If you are in market research (MR), you may debate which is the tortoise and which is the hare, but you probably do see a race and a competition for the prize. Or, more realistically, you may fear that others see it as a competition. The prize may be budget, recognition as the source of valuable customer insights, or winning jobs. This may be a limiting perspective. If you are in the Big Data space, you could see it as a race, but MR is probably not on your radar, or you consider the outcome a foregone conclusion. You probably believe that MR could be a valuable appendage but ultimately discount it when viewing the main role and impact of Big Data. There may be some blind spots. The two areas do overlap, however, and treating the situation emotionally or functionally as a competition will be counter-productive and lead to lost opportunity.
Some say that focus group and survey-based market research has reached the end of its life-cycle and will be replaced by Big Data analytics. Those who fully promote this position usually have an agenda to push or they are not well informed about market research. However, it does seem that these two areas of market intelligence, customer insight, and decision support both offer valuable and maybe even critical decision support. The million, or billion, dollar question is how to integrate them and leverage them together. Will this integration be the basis for the next round of killer market research applications or the means of getting the most value from your big data investments? My thoughts in this area were triggered by the linked article that contains some comments, including one application that combines the two areas of MR and Big Data.
In both areas, those who wish to capitalize the most will figure out how to combine them to leverage value. In Big Data, folks will come to realize that there are many holes in the data, many missing links, many areas where current or past data will not suffice in predicting the future, and many areas where the value of the insights can be magnified through the leaven of custom MR.
On the other hand, MR folks either have recognized or will come to recognize that the in-context reality and pure organic validity of Big Data add immensely to the precision inquiry possible in MR. Big Data integration will increase MR value by bridging limitations or biases created by interviewer/researcher subjectivity, sampling frames, inaccuracies in respondents’ self-reported behavior, doubts about the credibility of insights, and the perceived frequent lack of tangible real-world actionability, to name a few.
Seeing or specifically identifying the areas of combined value will be the first and most important step to making this happen. The next step, and arguably the more difficult one, is to actually combine the functions related to Big Data and MR to bring about a profitable union. Most corporations have these functions siloed separately, based on their historical origins and historically separate skill sets and processes. It goes even deeper: these two areas may compete for the same budget and may feel quite competitive with each other for recognition in the organization. Executives not only have an organizational issue to resolve but likely an emotional one as well.
On the supplier side, there are multiple difficulties in combining these areas. First, most MR firms do not have an understanding of, or a captive skill set that encompasses, Big Data analytics. They don’t have the mindset for what it is, how to get it, and what to do with it. Next, most emerging Big Data firms come from a programming, IT, data mining, or other highly technical area of strength. They do not have the insights background, training, or skills. Their foundation in marketing strategy, consumer behavior, and psychology is lighter than it needs to be. The softer, more granular, individually based understanding of motivations, attitudes, drivers, etc., and of how they mix together to generate action and commitment, is usually missing among folks who have been devoted to algorithmic programming or econometric modeling and mining.
The future in this area belongs to the people and organizations who can bridge these domains. This is a process, and the sooner it begins, the more likely you are to capitalize on it for competitive advantage. Identify the needs, find the relevant gaps in addressing them, prioritize, and act on filling the holes, showing tangible progress through a series of quick wins.
To motivate these changes, we need to see more examples of how the two domains combine for added value. The referenced article (noted below) provides one lightly sprinkled example. I believe we will begin seeing more.
How will it play out? Will Big Data analytics be the hare? Will the hare win? Maybe more fundamentally, is there a race? Maybe we see the races either being set up or already in progress, and the question is how we change the nature of the race so that, instead of competitors, the tortoise and the hare are teammates.
Referenced article: http://smartdatacollective.com/sandrosaitta/94181/guest-post-mark-zielinski
“The general consensus seems to be that “big data” is never going to replace traditional research – that is, specific research methods like surveys and focus groups that deal with particular topics will always be around. These specific research methods answer the question of “what” – that is, they are concerned with empirical details. For example, an online survey may indicate that compared to 2011, this year 5% of soda drinkers no longer drink Coca-Cola on a regular basis. Where “big data” aims to change research is in the “why” – that is, the broad trends and underlying reasons why certain results have been obtained. Using our previous example, “big data” may be able to tell us that key nutritional influencers have recently been saying that carbonated sugary drinks reduce life expectancy by an average of 4 years in healthy individuals. By having both the “what” results and the “why” results, researchers can use this combination of data to have a much clearer picture of a particular situation, and potentially be able to advise their clients on how to act to obtain the results they wish to achieve.”
Visionary but Naive: We don’t need more data scientists — just make big data easier to use — Tech News and Analysis
Posted by Chris in Intelligent Analytics Leadership on December 26, 2012
Simple Big Data Framework but Main Article Perhaps Naive: “We don’t need more data scientists — just make big data easier to use” — Tech News and Analysis
Clipped from: http://gigaom.com/2012/12/22/we-dont-need-more-data-scientists-just-simpler-ways-to-use-big-data/
As we try to break down the big task of tackling a Big Data strategy, having a framework greatly facilitates the process. A few blog posts back, I outlined one framework to use for attacking this area. This article contains another, somewhat unusual, structure. However, the author does not really apply the structure to this issue; he uses it to support the argument for more software and fewer data scientists.
The article sits on the border between visionary and naive. I agree that it’s inevitable that applications will be built to address more and more specific data use cases. However, the evidence is overwhelming that in the short run it is nearly negligent to advocate fewer data scientists or a smaller role for them. We are at the early stage of the Big Data cycle, where data and the application contexts are ill-defined. This is a period when grand promises can be made, only to be broken once implementation needs to happen in a justifiable way.
This does not mean that every company needs to hire its own data scientist, but rather that it should at least have access to those kinds of skills in a consulting format. Talent is scarce for full-time hiring but less scarce on a consulting basis. This is probably the best approach for most firms at this stage.
- Dec 22, 2012 – 12:00PM PT
We don’t need more data scientists — just make big data easier to use
- By Scott Brave, Baynote
Sure, more data scientists would be great. But Scott Brave, of Baynote, says the better solution is to create analytics products that are so easy to use that you don’t even need a data scientist.
photo: Sergey Nivens/Shutterstock.com
Virtually any article today about big data inevitably turns to the notion that the country is suffering from a crucial shortage of data scientists. A much-talked-about 2011 McKinsey & Co. survey pointed out that many organizations lack both the skilled personnel needed to mine big data for insights and the structures and incentives required to use big data to make informed decisions and act on them.
What seems to be missing from all of these discussions, though, is a dialogue about how to steer around this bottleneck and make big data directly accessible to business leaders. We have done it before in the software industry, and we can do it again.
To accomplish this goal, it’s helpful to understand the data scientist’s role in big data. Currently, big data is a melting pot of distributed data architectures and tools like Hadoop, NoSQL, Hive and R. In this highly technical environment, data scientists serve as the gatekeepers and mediators between these systems and the people who run the business – the domain experts.
While difficult to generalize, there are three main roles served by the data scientist: data architecture, machine learning, and analytics. While these roles are important, the fact is that not every company actually needs a highly specialized data team of the sort you’d find at Google or Facebook. The solution then lies in creating fit-to-purpose products and solutions that abstract away as much of the technical complexity as possible, so that the power of big data can be put into the hands of business users.
By way of example, think back to the web content management revolution at the turn of the century. Websites were all the rage, but the domain experts were continually banging their heads against the wall – we had an IT bottleneck. Every new piece of content had to be scheduled and sometimes hard-coded by the IT elite. So how was it resolved? We generalized and abstracted the basic needs into web content management systems and made them easy for non-techies to use. As long as you didn’t need anything too crazy, the problem was solved easily, and the bottleneck averted.
Let’s dig a little deeper into the three main roles of today’s data scientist, using online commerce as a backdrop.
The key to reducing complexity is to limit scope. Nearly every ecommerce business is interested in capturing user behavior – engagements, purchases, offline transactions and social data – and almost every one of them has a catalog and customer profiles.
Limiting scope to this basic functionality would allow us to create templates for the standard data inputs, making both data capture and connecting the pipes much simpler. We’d also need to find meaningful ways to package the different data architectures and tools, which currently include Hadoop, Hbase, Hive, Pig, Cassandra and Mahout. These packages should be fit for purpose. It comes down to the 80/20 rule: 80 percent of big data use cases (which is all most ecommerce businesses need), can be achieved with 20 percent of the effort and technology.
Surely we need data scientists in machine learning, right? Well, if you have very customized needs, perhaps. But most of the standard challenges that require big data, like recommendation engines and personalization systems, can be abstracted out. For example, a large part of the job of a data scientist is crafting “features,” which are meaningful combinations of input data that make machine learning effective. As much as we’d like to think that all data scientists have to do is plug data into the machine and hit “go,” the reality is people need to help the machine by giving it useful ways of looking at the world.
On a per domain basis, however, feature creation could be templatized, too. Every commerce site has a notion of buy flow and user segmentation, for example. What if domain experts could directly encode their ideas and representations of their domains into the system, bypassing the data scientists as middleman and translator?
It’s never easy to automatically surface the most valuable insights from data. There are ways to provide domain-specific lenses, however, that allow business experts to experiment – much like a data scientist. This seems to be the easiest problem to solve, as there are a variety of domain-specific analytics products already on the market.
But these products are still more constrained and less accessible to domain experts than they could be. There is definitely room for a friendlier interface. We also need to take into consideration how the machine learns from the results that analytics deliver. This is the critical feedback loop, and business experts want to provide modifications into that loop. This is another opportunity to provide a templatized interface.
As we learned in the CMS space, these solutions won’t solve every problem every time. But applying a technology solution to the broader set of data issues will relieve the data scientist bottleneck. Once domain experts are able to work directly with machine learning systems, we may enter a new age of big data where we learn from each other. Maybe then, big data will actually solve more problems than it creates.
Scott Brave is co-founder and CTO of Baynote, an e-tail and e-commerce advisory business. He is also an editor of the “International Journal of Human-Computer Studies” (Amsterdam: Elsevier) and co-author of “Wired for speech: How voice activates and advances the human-computer relationship” (Cambridge, MA: MIT Press).
Gartner Report Shows Huge Big Data Needs: May Seem Overwhelming, Doesn’t Need to Be
Posted by Chris in Big Data Leadership, Intelligent Analytics Leadership on December 17, 2012
Big Data Report: Computer News Middle East Article.
It’s an exciting time, but it shouldn’t be intimidating or paralyzing. Through a series of reports, Gartner and others continue to highlight the business impact of the growing big data phenomenon. Leaders should use these kinds of articles as a call to action to examine their own companies and set in motion the processes to ensure they are best positioned for the coming years. For many firms, this means outsourcing the necessary resources as the company moves along the learning curve, scopes out the opportunity, and generates success through a series of pilot efforts.
Important Q: If not the CIO, Who? at the
Posted by Chris in Big Data Leadership, Intelligent Analytics Leadership on December 12, 2012
Important Q: If not the CIO, Who? at the end of article: Social Analytics Isn’t Just For Social Networks | Big Data http://ow.ly/g1D6a
Openshaw, the author, asks a very important question and then in his answer raises significant issues around the CIO role and ownership when it comes to motivating adoption of insights analytics applications. What should we be requiring of the CIO?
The excerpt reads:
“If Not The CIO, Who?
IT leaders have an untapped asset. As the custodian of the company’s data, it’s the CIO’s job to tap into it.
Start with a one-time deep dive into the company’s social data. See that data warehouse? Dive in. What are you looking for? Patterns of interaction. Make observations about how these patterns influence performance and engage with other business leaders about these insights and about how the patterns identified in social data could be used to create truly predictive indicators.”
It appears that Openshaw is suggesting that the CIO take responsibility for and ownership of actually diving into the data to conduct the analyses and find the insights. That places a lot of expectation on the CIO and the CIO’s team. I wonder instead whether the CIO should be part of the process, understanding the needs of the other stakeholders to make sure that the infrastructure can support the needed applications, in line with the CIO team’s strengths, while domain experts in marketing or operations describe what they need from the data and do the analyses, in line with theirs. There is a strong argument that, to be most effective, those who will use the insights should be the ones to search for the patterns, working with the CIO’s team to gain the ability to do that.
Openshaw says that “collectors of data must learn which questions to ask and which hypotheses to test.” This learning should probably come from those in the organization who own the development of these questions and hypotheses for their areas of responsibility.
The CIO is probably in the best position to coordinate the stakeholders. The CIO can bring them together in a common cause of data management and analytics tools investment needs, and inspire them to learn what their needs are and to plan out their analytics applications’ agendas. The CIO is in a unique position to have the credibility and the appropriate business interest and political context to facilitate these kinds of vital areas of collaboration.
Good case study showing 10x time savings
Posted by Chris in Big Data Leadership, Intelligent Analytics Leadership on December 11, 2012
Good case study showing 10x time savings w/Big Data methods: Why Sears Is Going All-In On Hadoop – Global-cio – Executive http://ow.ly/g1xYS
In the article, the CTO gives some good but potentially contradictory advice about making these kinds of changes.
“You have to go fast and be bold without taking stupid risks,” Shelley says. Start with a business need “that causes enough pain that people will notice and they’ll see tangible benefits.”
By saying to be fast and bold, Shelley is not saying to be irrational, nor is he recommending unnecessary risk. Rather, I think he is saying to be a leader: create a vision of what can be done, and create a plan for an application that is bite-sized and can create quick value. Being bite-sized is what allows it to be quick; the impact it promises is what makes it bold.
No Knee-Jerk Reactions: A Reasoned Approach to a “Big Data Strategy”
Posted by Chris in Big Data Leadership, Intelligent Analytics Leadership on December 6, 2012
You have probably seen highly charged atmospheres in companies where the executive team has decided that the company must have a Big Data strategy. Those tasked with then putting it together feel this pressure to prepare an impressive plan that represents a big leap and a lot of investment. It’s similar to what happened in the 90’s with CRM and then what happened right around the 2000 mark with the Internet. Faced with the rising wave of interest and investor questions, companies risk making knee-jerk reactions which all too often result in corporate and personal disaster. Big Data is on the same trajectory. We see it in our everyday conversations with others, in the press and promotional arena. It has also been shown in the Gartner “Hype Cycle” for 2012 — http://www.infoq.com/news/2012/08/Gartner-Hype-Cycle-2012.
Leveraging Big Data in your company does not have to be mysterious, intimidating or expensive. There are different ways to approach the elephant and, as the adage goes, maybe the best way is to take it one piece at a time, to digest it properly and align it within the organization.
One approach I’d recommend is what I call the 3-V Application Value analysis. You assess the specific Big Data that you have access to and then look at how it differs from the data, analyses and resulting applications you currently use. Do this for each of the V’s that define Big Data: Velocity, Volume and Variety. This leads to an opportunities and costs analysis that then becomes the basis of a plan of action. It is a reasoned approach to getting the best value out of your investment in Big Data.
For instance, let’s take Velocity. What is it that is different about Big Data because of Velocity? And, when looking at the form of Big Data you have access to, what does that imply for the applications you could build? A very high level assessment is where you would start and it may look something like this:
- Opportunities. The opportunities that come from high velocity data include the development of real time or more immediately updating applications. These might be:
  - New and more relevant executive dashboards
  - Tools that allow you to make adjustments to engagement campaigns while they are executing
  - Development of individualized recommendation systems
  - Quickly identifying product quality issues
  - Better capitalizing on unforeseen benefits or uses of your product or service
- Costs. The costs of taking advantage of these opportunities would be driven by a number of factors, including:
  - Instituting new layers of data connectivity
  - Building machine learning and continual statistical processes layers
  - Designing and implementing real time reporting and simulation tools
The benefit of this kind of approach is the creation of a rational framework for advocating specific kinds of Big Data investment. A team can examine the detailed differences between existing data being used and Big Data, link those to potential new analyses or applications, and tie them to specific investments. The contrasting and incremental nature of this approach takes the mystery out of Big Data by relating it to what you have experience with and providing a stepping-stone approach that builds on strengths and ensures investments will be made with confidence and less risk.
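For teams that want to keep this kind of assessment in a form they can extend and review, here is one minimal sketch in Python. It simply records opportunities and cost drivers per V and reports which V's have not been assessed yet. The names `VAssessment` and `missing_dimensions` are my own illustrative assumptions, not part of any established tool, and the entries are just the Velocity items from the list above.

```python
from dataclasses import dataclass, field
from typing import List

# The three V's that define Big Data, as named in the post.
THREE_VS = ["Velocity", "Volume", "Variety"]


@dataclass
class VAssessment:
    """Hypothetical record of the opportunities and costs found for one V."""
    dimension: str
    opportunities: List[str] = field(default_factory=list)
    costs: List[str] = field(default_factory=list)

    def summary(self) -> str:
        # One-line overview suitable for a planning document.
        return (f"{self.dimension}: {len(self.opportunities)} opportunities, "
                f"{len(self.costs)} cost drivers")


def missing_dimensions(assessments: List[VAssessment]) -> List[str]:
    """Return the V's that have not been assessed yet, in canonical order."""
    covered = {a.dimension for a in assessments}
    return [v for v in THREE_VS if v not in covered]


# High-level Velocity assessment, copied from the example in the post.
velocity = VAssessment(
    dimension="Velocity",
    opportunities=[
        "New and more relevant executive dashboards",
        "Adjusting engagement campaigns while they are executing",
        "Individualized recommendation systems",
        "Quickly identifying product quality issues",
        "Capitalizing on unforeseen benefits or uses of the product",
    ],
    costs=[
        "New layers of data connectivity",
        "Machine learning and continual statistical process layers",
        "Real time reporting and simulation tools",
    ],
)

if __name__ == "__main__":
    print(velocity.summary())
    print("Still to assess:", ", ".join(missing_dimensions([velocity])))
```

The point of the structure is only to keep each V's opportunities tied to the investments they would require, so that Volume and Variety can be filled in the same way before a plan of action is drawn up.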