What Data Analytics Can And Cannot Do

Lawyers are often thought of as a traditional bunch, but there are significant opportunities for those attorneys who embrace new concepts built on data use.

There is a lot of interest in Big Data, Business Intelligence, Predictive Analytics, and other data-related fields these days. Whether in distinctly non-legal areas like the Internet of Things or legal areas like jury selection, litigation finance, textual analytics, and hedge fund replication, techniques for using data are clearly changing many aspects of the business world.

This set of tools and techniques as a whole can be generically termed data analytics, and with major increases in computing power and improvements in software interfaces, 2017 may well be the biggest year yet for data analytics advances. Still, most novices in the field misunderstand what data analytics can and cannot do.

To begin with, all data analytics processes start with a basic truism: garbage in, garbage out. If the data being analyzed is not accurate and representative of the world, then it’s not useful. This concept seems simple, but it is often forgotten. For instance, in a risk management function, people often think of data as being useful for extrapolating the likelihood of future events, but that is only true if the events we are worried about occur in our data with the same frequency that they do in the world.
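To make the frequency point concrete, here is a minimal Python sketch. Every number in it is made up for illustration: a hypothetical population of 10,000 contracts, and a hypothetical dataset assembled mostly from dispute files, so that disputes are heavily over-represented.

```python
# Illustrative, made-up numbers: how a non-representative sample
# distorts an estimate of how often an event occurs.

def event_rate(records):
    """Share of records in which the event occurred (True)."""
    return sum(records) / len(records)

# Hypothetical population: 10,000 contracts, 200 of which ended in a dispute.
population = [True] * 200 + [False] * 9_800

# Hypothetical dataset built mostly from dispute files: every dispute is
# captured, but only 200 of the uneventful contracts were ever recorded.
biased_sample = [True] * 200 + [False] * 200

true_rate = event_rate(population)
naive_rate = event_rate(biased_sample)

print(f"Rate in the world:   {true_rate:.1%}")
print(f"Rate in the dataset: {naive_rate:.1%}")
```

In this toy setup the dataset suggests that half of all contracts end in a dispute, while the rate in the underlying population is 2%. Any forecast extrapolated from the dataset alone would inherit that distortion.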

Take jury selection, for example. We can use a statistical model called a probit model to estimate the probability that a particular juror will reach a given decision at the end of the case. In order to model that effectively, we need to have data on the juror: age, sex, employment, background, etc. Once we have that data, we can figure out the decision that juror is likely to come to given the facts of the case, and, equally importantly, data analysis can tell us statistically how confident we are in that outcome. In other words, we might be 95% sure that juror XYZ would vote guilty, while we are only 63% sure that juror ABC would do so.
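As a rough sketch of how a probit model turns juror characteristics into a probability, the Python snippet below applies the standard normal cumulative distribution function to a weighted sum of features. The feature names and coefficient values are hypothetical and chosen purely for illustration; a real model would estimate them from data on past cases and jurors.

```python
from statistics import NormalDist

# Hypothetical coefficients, for illustration only. In practice these would
# be estimated from historical data on jurors and case outcomes.
COEFFICIENTS = {
    "intercept": -0.50,
    "age_decades": 0.10,
    "employed": 0.30,
    "prior_jury_service": 0.45,
}

def probit_probability(juror):
    """Probit link: P(guilty vote) = Phi(intercept + sum(coef * feature))."""
    index = COEFFICIENTS["intercept"]
    for name, value in juror.items():
        index += COEFFICIENTS[name] * value
    return NormalDist().cdf(index)

# Two hypothetical jurors described by the same features.
juror_xyz = {"age_decades": 6.0, "employed": 1, "prior_jury_service": 1}
juror_abc = {"age_decades": 3.0, "employed": 0, "prior_jury_service": 0}

for label, juror in [("XYZ", juror_xyz), ("ABC", juror_abc)]:
    print(f"Juror {label}: estimated probability {probit_probability(juror):.0%}")
```

The output is only as good as the coefficients behind it, which is why the quality of the underlying data matters so much.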

Yet in order to build this type of model, we need to have the right underlying data. That means having the right data on the juror, as well as data on past cases decided by other jurors and on those jurors themselves. In other words, building a data model requires an investment of time and money; in many cases it is not a simple one-off process. Data analytics is powerful, but only if we have the right tool for the job. Many industry insiders say that the single biggest problem holding back effective use of new data-related tools and technologies is the lack of data.

The second major issue with data analytics is that we need data which is properly cleaned and compiled. Most of the time the data used for analysis comes from different sources, some of which are high quality and others of which are low quality. That means that the datasets have to be cleaned and merged together into a single larger database. This is difficult and time-consuming in many cases, especially with large datasets such as those used in investing.

For instance, when trying to replicate hedge funds, one needs to use data on hedge fund returns, which comes from one source; data on liquid futures and ETF returns, which comes from a second source; and data on the characteristics of those ETFs, which comes from a third source. The three sets of data all have to be merged together based on a single unifying factor like the date of the returns. Once this is done, the data has to be cleaned to deal with issues like hedge funds that close up shop, or bid-ask bounce in ETF pricing. Once this process is complete, you have a formula that allows you to replicate the performance of any hedge fund category at a much lower cost, but again it requires time and investment to get accurate results.
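The merge-then-clean step described above can be sketched in a few lines of Python. The three dictionaries below are hypothetical stand-ins for the three data sources, with invented values, and the cleaning rule (drop any date with a missing value) is a deliberately simple assumption; real pipelines deal with fund closures and pricing noise in more careful ways.

```python
from datetime import date

# Hypothetical monthly data keyed by date, standing in for three sources.
hedge_fund_returns = {
    date(2016, 1, 29): 0.012,
    date(2016, 2, 29): -0.004,
    date(2016, 3, 31): 0.009,
    date(2016, 4, 29): None,  # fund stopped reporting
}
etf_returns = {
    date(2016, 1, 29): 0.020,
    date(2016, 2, 29): -0.010,
    date(2016, 3, 31): 0.015,
    date(2016, 4, 29): 0.005,
}
etf_bid_ask_spreads = {  # one example of an ETF characteristic
    date(2016, 2, 29): 0.0004,
    date(2016, 3, 31): 0.0003,
    date(2016, 4, 29): 0.0005,
}

def merge_on_date(*sources):
    """Inner join: keep only the dates that appear in every source."""
    shared_dates = set(sources[0])
    for source in sources[1:]:
        shared_dates &= set(source)
    return {d: tuple(source[d] for source in sources) for d in sorted(shared_dates)}

def drop_incomplete(rows):
    """Simple cleaning rule: discard dates where any value is missing."""
    return {d: values for d, values in rows.items() if None not in values}

merged = merge_on_date(hedge_fund_returns, etf_returns, etf_bid_ask_spreads)
cleaned = drop_incomplete(merged)

for d, (fund, etf, spread) in cleaned.items():
    print(d.isoformat(), fund, etf, spread)
```

Only the dates that survive both steps would feed into whatever replication model is fitted afterward, which is one reason the merging and cleaning work takes up so much of the effort.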

Lawyers are often thought of as a traditional bunch, but there are significant opportunities for those attorneys who embrace new concepts built on data use. The key to such efforts, though, is investing in new data analytics capacity as a process rather than thinking of it as a one-time effort.

Best of luck in the new year.

For those who have tried data analytics techniques with or without success in their practice, I welcome comments and feedback on how your efforts went.


Michael McDonald is an assistant professor of finance at Fairfield University in Connecticut. He holds a PhD in finance. Michael consults extensively with organizations ranging from Fortune 500 companies to start-up businesses on financial matters through Morning Investments Consulting. Michael has served as an expert witness in legal disputes, and is an arbitrator with the Financial Industry Regulatory Authority (FINRA). Michael can be reached at M.McDonald@MorningInvestmentsCT.com.
