Text Mining or Text Analytics
The terms “text mining” and “text analytics” are often used interchangeably and refer to the extraction of data or information from
text. The text (words, sentences, paragraphs) could come from open-ended questions in a survey or CRM system, from customer complaints or comments, the entries
of salespeople, comments on a website, etc.
An array of techniques may be employed to derive meaning from text. The most accurate method is an intelligent, trained human being reading the text and
interpreting its meaning. This is the slowest and most costly method, but also the most powerful. Ideally, the reader is trained in qualitative
research techniques and understands the industry and contextual framework of the text. A well-trained qualitative researcher can extract extraordinary understanding
and insight from text. In a typical project, the qualitative researcher might read hundreds of paragraphs to analyze the text, develop hypotheses, draw conclusions,
and write a report. This type of analysis is subject to the risks of bias and misinterpretation on the part of the qualitative researcher, but these limitations
are with us always, regardless of method. The power of the human mind cannot be equaled by any software or any computer system. Decision Analyst’s
highly trained qualitative researchers are experts at understanding text.
The history of text analytics traces back to World War II and the development of “content analysis” by governmental intelligence services. That
is, intelligence analysts would read documents, magazines, records, dispatches, etc., and assign numeric codes to different topics, concepts, or ideas. By summing
up these numeric codes, the analyst could quantify the different concepts or ideas, and track them over time. This approach was further developed by the
survey research industry after the war. Today, as then, open-ended questions in surveys are analyzed by someone reading the textual answers and assigning numeric
codes. These codes are then summarized in tables, so that the analyst has a quantitative sense of what people are saying. This remains a powerful method
of text mining or text analytics. It leverages the power of the human mind to discern subtleties and context.
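The tabulation step described above can be sketched in a few lines of Python. This is only an illustration: the periods, numeric codes, and codebook labels are hypothetical, and the codes are assumed to have already been assigned by a human reader.

```python
from collections import Counter, defaultdict

# Hypothetical data: (period, numeric code assigned by a human coder).
coded = [
    ("2023-Q1", 101),
    ("2023-Q1", 102),
    ("2023-Q1", 101),
    ("2023-Q2", 101),
    ("2023-Q2", 103),
]

# Hypothetical codebook mapping each numeric code to the idea it stands for.
codebook = {101: "price too high", 102: "slow delivery", 103: "friendly staff"}


def tabulate_codes(coded_responses):
    """Count how often each code appears in each period."""
    table = defaultdict(Counter)
    for period, code in coded_responses:
        table[period][code] += 1
    return table


table = tabulate_codes(coded)

# Print one row per period and code so the counts can be compared over time.
for period in sorted(table):
    for code, count in sorted(table[period].items()):
        print(period, code, codebook[code], count)
```

Keeping the counts keyed by period makes it straightforward to see whether a given idea is mentioned more or less often from one period to the next.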
The first step is careful selection of a representative sample of respondents or responses. In surveys the sample is usually representative and comparatively
small (fewer than 2,000 respondents), so all open-ended responses are coded. However, in the case of a CRM system or customer complaint system, the text might be made
up of millions of customer comments. In that case, the first step is the random selection of a few thousand records, which are then checked for duplicates, geographic
distribution, etc. Then, a human being reads each and every paragraph of text and assigns numeric codes to different meanings and ideas. These codes are tabulated
and statistical summaries are prepared for the analyst. This is text mining or text analytics at its apogee. Open-end coding offers the strength of numbers
(statistical significance) and the intelligence of the human mind. Decision Analyst operates a large multilanguage coding facility with highly trained staff
specifically for content analysis and text analytics.
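As a rough sketch of the sampling step for a large comment file, the Python below drops duplicate comments, draws a simple random sample, and summarizes the geographic spread of that sample. The record fields (`id`, `region`, `text`), the function name `draw_sample`, and the sample size are assumptions made for illustration. Duplicates are removed before sampling here so the sample is not depleted afterward, which is a design choice rather than something the text prescribes.

```python
import random
from collections import Counter

# Hypothetical customer comment records.
records = [
    {"id": 1, "region": "TX", "text": "Late delivery"},
    {"id": 2, "region": "TX", "text": "late  delivery"},
    {"id": 3, "region": "CA", "text": "Great service"},
    {"id": 4, "region": "NY", "text": "Wrong item shipped"},
    {"id": 5, "region": "CA", "text": "Billing error"},
]


def normalize(text):
    """Lowercase and collapse whitespace so near-identical comments match."""
    return " ".join(text.lower().split())


def draw_sample(all_records, sample_size, seed=0):
    """Drop duplicate comments, then draw a simple random sample."""
    seen = set()
    unique = []
    for rec in all_records:
        key = normalize(rec["text"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    k = min(sample_size, len(unique))
    return rng.sample(unique, k)


sample = draw_sample(records, sample_size=3)

# Check the geographic distribution of the sample before coding begins.
region_counts = Counter(rec["region"] for rec in sample)
print(region_counts)
```

In practice the sample would run to a few thousand records, and its regional counts would be compared against the full file before a human coder reads each comment.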
Machine Text Mining or Text Analytics
With the explosion of keyboard-generated text related to the spread of PCs and the Internet over the past two decades, many companies are searching for
automated ways to analyze large volumes of textual data. Decision Analyst offers several text-analytic services, based on different software systems, to analyze
and report on textual data. These software systems are very powerful, but they cannot take the place of the thinking human brain. The results from these software
systems should be thought of as approximations, as crude indicators of truth and trends, and must always be verified by other methods and other data.
Text Mining Services
For more information on Text Mining or Text Analytics, please contact Rod Carver, Vice President, Director of Predictive Analytic Services (rcarver@decisionanalyst.com),
or call 1-800-ANALYSIS (262-5974) or 1-817-640-6166.