
Part 2: Is There an App for This? Reviewing and Producing Mobile Data

In Part 1 of our series on mobile devices, we discussed preserving and collecting mobile device data. In this article, we turn to the types of information you can expect to encounter with mobile devices and key considerations for analyzing, reviewing and producing these types of data.

To start, you must acquire data from the mobile devices. As you begin this process, you also should start examining the data you are acquiring, in particular data about communications, custodian device activity, and device files. Once you have acquired data, you will want to determine which data to use and then export data so you can work with it more readily – reducing, analyzing, reviewing, producing, and presenting it as you would any other ESI. There are a limited but, fortunately, growing number of options for working with the exported data. They include using spreadsheet reports, loading data into review platforms and working with it there, and, to a limited extent, working with the data presented in a near-native format.

Mobile device data you can acquire

Mobile device content can be acquired using a variety of tools such as Cellebrite, Oxygen Forensics, and BlackBag. Using such tools, you have access to a broad array of information types, as indicated in the Cellebrite screenshot in Figure 1. The most commonly used types can be grouped into three broad categories, bearing in mind that many other types are available as well.

The first category of information types is communication data. This category includes chat messages, instant messages, SMS/MMS messages, call logs, and voicemails. It contains the most commonly reviewed and produced mobile device data.

The second category is custodian device activity. It comprises such information as calendar entries, device logs, web histories, user accounts, and contacts.

The last category is device files. These are files that are physically stored on the device, such as stored media files and downloaded attachments.

Determining which data to use

Determining which mobile device data you want to work with is an exercise you may have to go through anew for each matter and each device.

The wide range of data available from mobile devices requires that you carefully select which types you intend to consider. It also means you should be mindful about how you intend to analyze, review, produce, and present that content.

The nature of your matter might be such that you select only a subset of this information for review; location content, for example, might be the only data that you care about. If, however, you need to examine the full array of communications available from or through a mobile device, then you will need to be prepared to look at content from all applications on that device and develop a process for assessing each application’s data. Chances are, your needs will fall somewhere between these two extremes.

Exporting mobile device data for analysis and review

Having determined which mobile device data you want to focus on, you need to export that data to work with it further. Several common export formats are used for exporting mobile device data. The two most common formats are report-style spreadsheets and SQLite databases. Both provide virtually the same information. How they present the information, and what they let you do with it, differs.

Spreadsheet reports provide more immediate access to data from individual records, such as chat messages, call logs, and text messages. SQLite databases offer greater power but require knowledge of database structure and SQLite tools that enable the reviewer to query and review the data. Right now, report-style spreadsheets are in vogue, as many find them both easier to use and more flexible.
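To give a sense of what querying a SQLite export involves, here is a minimal sketch using Python's built-in sqlite3 module. The table name (messages), its columns, and the sample rows are hypothetical; real exports differ by tool and by application, so the schema has to be inspected before writing any query.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical 'messages' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE messages ("
        "id INTEGER PRIMARY KEY, sender TEXT, body TEXT, sent_utc TEXT)"
    )
    conn.executemany(
        "INSERT INTO messages (sender, body, sent_utc) VALUES (?, ?, ?)",
        [
            ("+15550100", "Running late", "2019-03-01T14:05:00"),
            ("+15550101", "See the attached contract", "2019-03-01T14:07:00"),
            ("+15550100", "Contract looks fine", "2019-03-02T09:30:00"),
        ],
    )
    return conn


def find_messages(conn: sqlite3.Connection, keyword: str) -> list:
    """Return (sender, sent_utc, body) rows whose body contains the keyword.

    SQLite's LIKE is case-insensitive for ASCII letters by default.
    """
    cursor = conn.execute(
        "SELECT sender, sent_utc, body FROM messages "
        "WHERE body LIKE ? ORDER BY sent_utc",
        (f"%{keyword}%",),
    )
    return cursor.fetchall()


conn = build_demo_db()
hits = find_messages(conn, "contract")
for row in hits:
    print(row)
```

The same keyword search in a spreadsheet report is a simple filter on one column, which is part of why many reviewers find the spreadsheet route easier.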

Analyzing and reviewing mobile device data

Once you have figured out which mobile data you want to review and analyze, the question quickly arises – how do you do this?

A variety of approaches can be used to review and produce mobile data. The workflows and solutions we describe here are general observations of current industry practices. The techniques you choose to employ in one case may not fit the needs of another, so you always should be ready to consult a technology professional about the best options for a particular matter if you find yourself in unfamiliar territory.

Some mobile device data review challenges

Mobile device data present challenges that are important to understand before you start digging into the data to analyze and review it.

Missing metadata. The first challenge is that of missing metadata. When records are deleted from a mobile device, significant parts of the metadata for those records will be gone. Similarly, there can be issues of missing metadata for mobile device applications that do not generate or retain standard metadata, such as record creation dates.

Encryption. Another major challenge is data for encrypted applications. Some applications encrypt messages and attachments, which can render the data inaccessible for most mobile device acquisition tools.

Identities. Another set of challenges arises from how mobile devices track the identities of people with whom the devices’ users communicate. When mobile device data is collected, some messages show a sender or recipient’s chat ID or phone number instead of the person’s name. Depending on the chat ID or the volume of different phone numbers, this lack of an obvious identifier makes it difficult to determine the identities of communicating parties. Sometimes this can be corrected by matching address book entries to those communications, but even then an address book may contain partial or conflicting information. Large group chats can compound this challenge, as multiple parties might have been communicating both with each other and across one another.
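The address book matching described above can be sketched roughly as follows. This is an illustration under stated assumptions, not a forensic tool's method: it handles phone numbers only (not chat IDs), assumes US-style ten-digit numbers, and uses made-up names and numbers. Conflicting entries are reported rather than silently resolved.

```python
import re


def normalize_number(raw: str) -> str:
    """Strip non-digits and keep the last 10 digits (assumes US-style numbers)."""
    digits = re.sub(r"\D", "", raw)
    return digits[-10:]


def build_lookup(address_book):
    """Map each normalized number to the set of names recorded for it."""
    lookup = {}
    for name, number in address_book:
        lookup.setdefault(normalize_number(number), set()).add(name)
    return lookup


def resolve_number(identifier: str, lookup) -> str:
    """Return a single name, or a label flagging a missing or conflicting match."""
    names = lookup.get(normalize_number(identifier), set())
    if not names:
        return f"UNRESOLVED ({identifier})"
    if len(names) == 1:
        return next(iter(names))
    return "CONFLICT (" + " / ".join(sorted(names)) + ")"


# Hypothetical address book with two differently formatted entries for one number.
address_book = [
    ("Pat Lee", "(555) 010-0100"),
    ("P. Lee", "+1 555 010 0100"),
    ("Sam Roe", "555-010-0101"),
]
lookup = build_lookup(address_book)

for sender in ["15550100101", "+1 (555) 010-0100", "5550100199"]:
    print(sender, "->", resolve_number(sender, lookup))
```

Unresolved and conflicting identifiers still need a human decision; the sketch only makes them easy to find.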

Using spreadsheet reports

A common approach to analyzing and reviewing mobile data is to work with a spreadsheet-based report generated from the device’s forensic image or from an extract of the device’s SQLite database (see Figure 2). Often the review of the spreadsheet report is performed by a single person. That person goes through the rows of information in the spreadsheet to determine which rows to produce. After production decisions have been made, a new copy of the spreadsheet is created. The new copy contains only those rows of information designed to be produced. That copy then is produced to the other party.
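The step of creating a production copy can be sketched as below, assuming the report has been saved as CSV and the reviewer has marked decisions in a column. The column names (Row, From, Body, Produce) and the Y/N convention are hypothetical; actual report layouts vary by tool.

```python
import csv
import io

# Hypothetical report content; a real report would be read from a file.
REPORT = """Row,From,Body,Produce
1,+15550100,Running late,N
2,+15550101,See the attached contract,Y
3,+15550100,Contract looks fine,Y
"""


def production_copy(report_text: str, decision_column: str = "Produce") -> str:
    """Return CSV text holding only rows marked 'Y', without the decision column."""
    reader = csv.DictReader(io.StringIO(report_text))
    fields = [name for name in reader.fieldnames if name != decision_column]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if row[decision_column].strip().upper() == "Y":
            writer.writerow({name: row[name] for name in fields})
    return out.getvalue()


produced = production_copy(REPORT)
print(produced)
```

Dropping the decision column keeps the reviewer's internal coding out of the copy that goes to the other party.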

Although this approach is simple on its face, it has several disadvantages, especially in matters with multiple mobile devices. One problem is that those using this approach typically do not apply document control numbers to conversations and other data within the report. The lack of control numbers can lead to confusion among the parties about which communications they are referring to and can mean extra time spent confirming that everyone is referring to the same information. An additional problem is that using these spreadsheets generally means working outside of a centralized platform. That makes it more difficult to accomplish a host of tasks that are not just logistical but also affect the substance of the work performed, such as tracking decisions about the device data, sharing work product, and providing team members with access to both the device data and information about how that data was treated.

Using review platforms

An alternative approach is to reorganize the content of the reports so that it can be loaded into a review platform. This often means starting with individual spreadsheets, generating a review database record for each record reflected in each spreadsheet, assigning a control number to each record, recording parent-child relationships, and so on. It also means creating one or more load files (see Figure 3) so that content can be loaded into the review platform in a usable form. The load files, records, and any native files from the device then are loaded into the review platform. To help the review team work with the device data now available in the review platform, typically someone with appropriate expertise creates customized layouts and views for the device data and adds tags that can be used to track relevance, privilege, and other decisions made about the content.
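As a rough illustration of the control number and parent-child steps, the sketch below numbers each message, numbers its attachments as children, and writes a simple delimited load file. The MOB prefix, the field names, and the pipe delimiter are assumptions made for this example; each review platform defines its own load file specification, which should be followed in practice.

```python
import csv
import io

FIELDS = ["ControlNumber", "ParentID", "Type", "Text", "NativeFile"]


def assign_control_numbers(records, prefix="MOB", start=1, width=7):
    """Flatten messages and attachments into rows with sequential control numbers.

    Each attachment row carries its parent message's control number in ParentID.
    """
    rows = []
    counter = start
    for record in records:
        parent_id = f"{prefix}{counter:0{width}d}"
        counter += 1
        rows.append({"ControlNumber": parent_id, "ParentID": "",
                     "Type": "message", "Text": record["body"], "NativeFile": ""})
        for attachment in record.get("attachments", []):
            rows.append({"ControlNumber": f"{prefix}{counter:0{width}d}",
                         "ParentID": parent_id, "Type": "attachment",
                         "Text": "", "NativeFile": attachment})
            counter += 1
    return rows


def to_load_file(rows, delimiter="|") -> str:
    """Write rows as delimited text; the csv module quotes fields when needed."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, delimiter=delimiter,
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Hypothetical records taken from a spreadsheet report.
records = [
    {"body": "Running late", "attachments": []},
    {"body": "See the attached contract", "attachments": ["contract.pdf"]},
]
rows = assign_control_numbers(records)
print(to_load_file(rows))
```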

A challenge with this approach is that each piece of information from each report is treated as a separate document. This means every individual text or chat is represented as a separate document in the database. As a result, understanding context, determining which information to produce, and deciding how to produce it can become complicated.

A near-native approach

Recognizing the need for effective review of mobile data in centralized review platforms, several providers have been developing processes to format chat, SMS, and MMS data in a “near-native” format. Figure 4 shows an example from Relativity. The resulting document looks more like a standard chat conversation, allowing reviewers to read it more easily and better place it in context.

This emerging approach requires that thought be given to formatting issues. You will need to decide what constitutes an appropriate “document break.” Should a full day of chats be defined as a single document, for example, or would another time period make sense? You also will need to consider the possibility that the resulting documents may not contain all the context needed to assess the information.
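One way to picture the document break decision is the sketch below, which groups messages into one document per conversation per calendar day. This is only one possible rule and is not how any particular provider builds near-native documents. The tuple layout and sample data are hypothetical, and all timestamps are assumed to be in a single time zone; a different time zone could move messages across the day boundary.

```python
from collections import defaultdict
from datetime import datetime


def group_by_day(messages):
    """Group messages into documents keyed by (conversation_id, date).

    Each message is (conversation_id, iso_timestamp, sender, body).
    Lines within a document are in chronological order.
    """
    documents = defaultdict(list)
    for conversation, stamp, sender, body in sorted(
        messages, key=lambda m: (m[0], datetime.fromisoformat(m[1]))
    ):
        when = datetime.fromisoformat(stamp)
        key = (conversation, when.date().isoformat())
        documents[key].append(f"[{when:%H:%M}] {sender}: {body}")
    return dict(documents)


# Hypothetical chat spanning two days.
messages = [
    ("chat-1", "2019-03-01T14:07:00", "Sam", "See the attached contract"),
    ("chat-1", "2019-03-01T14:05:00", "Pat", "Running late"),
    ("chat-1", "2019-03-02T09:30:00", "Pat", "Contract looks fine"),
]
documents = group_by_day(messages)
for key, lines in documents.items():
    print(key)
    for line in lines:
        print("  " + line)
```

In this example the reply sent the next morning lands in a separate document from the message it answers, which is exactly the loss of context the text warns about.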

Additional challenges

Regardless of how you format mobile device data for review, you likely will face other data management challenges, especially when you need to work with data from multiple devices. Currently there is no consistently effective way of deduplicating mobile data, either within or across devices. Searching mobile data also can be more difficult because it comprises many metadata fields that are not commonly present in e-discovery. You may need to build custom search indexes to ensure that your search is reasonably complete.
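To see why deduplication is hard, consider a naive heuristic: hash each message's sender, timestamp, and normalized body, and flag repeats. The sketch below is that heuristic and nothing more; the field names and sample records are hypothetical. It catches copies that differ only in whitespace or letter case, but it misses the same message when two devices record slightly different timestamps.

```python
import hashlib


def message_key(sender: str, timestamp: str, body: str) -> str:
    """Hash sender, timestamp, and a whitespace- and case-normalized body."""
    parts = [sender.strip(), timestamp.strip(), " ".join(body.split()).lower()]
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()


def flag_duplicates(records):
    """Return (record, is_duplicate) pairs; the first occurrence is never flagged."""
    seen = set()
    flagged = []
    for record in records:
        key = message_key(record["sender"], record["timestamp"], record["body"])
        flagged.append((record, key in seen))
        seen.add(key)
    return flagged


# Hypothetical records from two devices holding the same conversation.
records = [
    {"device": "Phone A", "sender": "+15550100",
     "timestamp": "2019-03-01T14:05:00", "body": "Running late"},
    {"device": "Phone B", "sender": "+15550100",
     "timestamp": "2019-03-01T14:05:00", "body": "Running  late "},
    {"device": "Phone B", "sender": "+15550100",
     "timestamp": "2019-03-01T14:05:02", "body": "Running late"},
]
flags = [is_dup for _, is_dup in flag_duplicates(records)]
print(flags)
```

The third record is very likely the same message, yet a two-second timestamp difference is enough to defeat exact matching.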

Production of mobile device data

Many of the questions surrounding review of mobile device data also apply to production. Producing mobile device data is unusual because the data records themselves are not true native files. Mobile device data is a collection of metadata and, possibly, attachments. While the attachments can be viewed natively, the records cannot. To produce them in a near-native format, you must convert the data into a new format, and as with review, several different options are available.

Currently, the two primary approaches for production are the same as in review: produce a spreadsheet with the metadata and any accompanying native file attachments, or produce near-native format files with accompanying load files and native attachment files. The decision about which approach to use depends on the negotiations between the parties and the relative costs of each method. Producing in spreadsheets is a cheaper and faster solution, but in matters involving numerous mobile devices, managing numerous spreadsheets can be unwieldy without the controls that can be put in place with a database-centered approach with near-native files.

Regardless of the production method that you choose, be mindful of several challenges for any production. A particularly notable challenge is redaction. If you are redacting parts of or entire messages, you need to make sure the message is truly redacted in both the metadata and near-native files. While there are innovative solutions for this in the market, no standard approach or set of best practices has been established. This means that any production guidelines should be carefully defined by both parties and that a thorough review of the production should be performed.

Conclusion

Options for getting mobile device data ready to analyze, review, produce, and ultimately present still are limited. Although tools and techniques continue to improve, at this stage a significant amount of manual or custom work often is needed to get the value you are seeking from that data. As we gaze into our murky crystal balls, however, we see more than a little hope for the future of mobile device data review and production.

In our next post, we will go beyond review (at least in the traditional sense), to discuss strategies for managing non-communication mobile device data (e.g., web browser history or address books) and performing analytics of mobile device content.

About the Author

George Socha
An award-winning attorney, expert witness, and consultant and co-founder of EDRM, George Socha has worked with corporate, law firm, governmental, and service and software provider clients on e-discovery and litigation support challenges and opportunities for over three decades. As a practicing lawyer, an independent consultant, and at a large service provider, he has been active in just about every aspect of e-discovery: directing activities from early identification of potentially pertinent ESI through analyses of produced materials; designing and supervising implementation of TAR and other protocols; presenting content at depositions, hearings, arbitrations, and trials; and testifying on issues ranging from spoliation to pricing to state of the practice. From 2003 to 2009, George and Tom Gelbmann conducted the Socha-Gelbmann Electronic Discovery Survey. In 2005, they founded EDRM, which creates practical global resources to improve e-discovery, privacy, security, and information governance.
Martha Louks
Director of Technology Services at McDermott Will & Emery LLP
Martha Louks focuses on implementing high-value, efficiency-driven solutions that improve the delivery of legal services for clients. Martha collaborates with legal teams to develop technology and workflow approaches that align with case strategy, reduce costs and improve efficiency. Martha has extensive experience developing defensible processes that center on the tailored use of technology and experienced professionals to achieve results for our clients. She evaluates the efficacy of a wide range of technologies to determine which tools are best suited to a matter’s unique needs. Martha also oversees technology and provides consulting services for McDermott Discovery. She prepares discovery preservation plans capable of withstanding intense scrutiny while simultaneously addressing the flexibility necessary for clients to meet their business obligations. She advises on electronic evidence considerations, working closely with legal teams to incorporate the results of forensic investigation into legal analysis. Martha is also highly skilled in using artificial intelligence for Technology Assisted Review and investigations, consulting on the defensibility and effectiveness of varied workflow approaches. Martha is a Certified Relativity Expert, having three concurrent certifications: Relativity Certified Administrator, Relativity Analytics Specialist and Relativity Assisted Review Specialist.
Joe Sremack
Director, Data Analytics & Software Robotics at BDO
Joe Sremack is a director in BDO’s Data Analytics & Software Robotics practice. His primary focus is developing and implementing strategies and technologies to assist corporate and legal clients in matters involving complex technology issues and investigations. Joe has deep knowledge of structured data collection and analysis, information technology (IT) assessments, electronic discovery, and software analysis. A computer scientist by training, Joe has conducted numerous investigations and assessments—including the three largest Ponzi schemes in history—involving systems investigations, data analysis, source code analysis, data compliance assessments and the evaluation of technology solutions. He has assisted clients across the U.S. and internationally in such matters as financial crime investigations, regulatory compliance assessments, intellectual property theft investigations and antitrust disputes. Joe has worked with clients across a wide range of data-intensive industries, including healthcare, finance, technology, and energy. He has also served clients in the hospitality, non-profit and telecommunications sectors. Joe frequently presents and writes on topics involving transactional data systems and is the author of Big Data Forensics, a technical guide on performing investigations of large-scale, clustered data systems. Prior to joining BDO, Joe held leadership positions at several expert service and advisory consulting firms.