In Part 1 of our series on mobile devices, we discussed preserving and collecting mobile device data. In this article, we turn to the types of information you can expect to encounter with mobile devices and key considerations for analyzing, reviewing and producing these types of data.
To start, you must acquire data from the mobile devices. As you begin this process, you also should start examining the data you are acquiring, in particular data about communications, custodian device activity, and device files. Once you have acquired data, you will want to determine which data to use and then export data so you can work with it more readily – reducing, analyzing, reviewing, producing, and presenting it as you would any other ESI. There are a limited but, fortunately, growing number of options for working with the exported data. They include using spreadsheet reports, loading data into review platforms and working with it there, and, to a limited extent, working with the data presented in a near-native format.
Mobile device data you can acquire
Mobile device content can be acquired using a variety of tools such as Cellebrite, Oxygen Forensics, and BlackBag. Using such tools, you have access to a broad array of types of information, as indicated in the Cellebrite screenshot in Figure 1. The most commonly used information types can be grouped into three broad categories, though many other types are available as well.
The first category of information types is communication data. This category includes chat messages, instant messages, SMS/MMS messages, call logs, and voicemails. It contains the most commonly reviewed and produced mobile device data.
The second category is custodian device activity. It comprises such information as calendar entries, device logs, web histories, user accounts, and contacts.
The last category is device files. These are files that are physically stored on the device, such as stored media files and downloaded attachments.
Determining which data to use
Determining which mobile device data you want to work with is an exercise you may have to go through anew for each matter and each device.
The wide range of data available from mobile devices requires that you carefully select which types you intend to consider. It also means you should be mindful about how you intend to analyze, review, produce, and present that content.
The nature of your matter might be such that you select only a subset of this information for review; location content, for example, might be the only data that you care about. If, however, you need to examine the full array of communications available from or through a mobile device, then you will need to be prepared to look at content from all applications on that device and develop a process for assessing each application’s data. Chances are, your needs will fall somewhere between these two extremes.
Exporting mobile device data for analysis and review
Having determined which mobile device data you want to focus on, you need to export that data to work with it further. Several common export formats are used for exporting mobile device data. The two most common formats are report-style spreadsheets and SQLite databases. Both provide virtually the same information. How they present the information, and what they let you do with it, differs.
Spreadsheet reports provide more immediate access to data from individual records, such as chat messages, call logs, and text messages. SQLite databases offer greater power but require knowledge of database structure and SQLite tools that enable the reviewer to query and review the data. Right now, report-style spreadsheets are in vogue, as many find them both easier to use and more flexible.
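To give a sense of what "querying" an exported SQLite database involves, here is a minimal Python sketch using the standard sqlite3 module. The table and column names (messages, sender, body, sent_utc) are hypothetical; real exports use schemas that vary by tool and by application, so treat this as an illustration rather than a description of any particular product's output.

```python
import sqlite3

# Hypothetical schema and sample rows; real exports differ by tool and by app.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages ("
    "id INTEGER PRIMARY KEY, sender TEXT, body TEXT, sent_utc TEXT)"
)
conn.executemany(
    "INSERT INTO messages (sender, body, sent_utc) VALUES (?, ?, ?)",
    [
        ("+15550100", "Running late", "2020-03-01T14:05:00"),
        ("+15550101", "See you at 3", "2020-03-01T14:07:00"),
        ("+15550100", "Contract attached", "2020-03-02T09:30:00"),
    ],
)


def messages_from(conn, sender, start, end):
    """Return (sent_utc, body) rows for one sender within [start, end).

    Timestamps are assumed to be ISO 8601 text, so plain string
    comparison orders them chronologically.
    """
    cur = conn.execute(
        "SELECT sent_utc, body FROM messages "
        "WHERE sender = ? AND sent_utc >= ? AND sent_utc < ? "
        "ORDER BY sent_utc",
        (sender, start, end),
    )
    return cur.fetchall()


# All messages sent by one number on March 1, 2020.
rows = messages_from(conn, "+15550100", "2020-03-01", "2020-03-02")
```

The same question asked of a spreadsheet report would be a filter on two columns, which is part of why many reviewers find the spreadsheet route more approachable.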
Analyzing and reviewing mobile device data
Once you have figured out which mobile data you want to review and analyze, the question quickly arises – how do you do this?
A variety of approaches can be used to review and produce mobile data. The workflows and solutions we describe here are general observations of current industry practices. The techniques you choose to employ in one case may not fit the needs of another, so you always should be ready to consult a technology professional about the best options for a particular matter if you find yourself in unfamiliar territory.
Some mobile device data review challenges
Mobile device data present challenges that are important to understand before you start digging into the data to analyze and review it.
Missing metadata. The first challenge is that of missing metadata. When records are deleted from a mobile device, significant parts of the metadata for those records will be gone. Similarly, there can be issues of missing metadata for mobile device applications that do not generate or retain standard metadata, such as record creation dates.
Encryption. Another major challenge is data for encrypted applications. Some applications encrypt messages and attachments, which can render the data inaccessible for most mobile device acquisition tools.
Identities. Another set of challenges arises from how mobile devices track the identities of people with whom the devices’ users communicate. When mobile device data is collected, some messages show a sender’s or recipient’s chat ID or phone number instead of the person’s name. Depending on the chat ID or the volume of different phone numbers, this lack of an obvious identifier makes it difficult to determine the identities of communicating parties. Sometimes this can be corrected by matching address book entries to those communications, but even then an address book may contain partial or conflicting information. Large group chats can compound this challenge, as multiple parties might have been communicating both with each other and across one another.
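The address book matching described above can be sketched in Python as follows. This is a simplified illustration, not a standard method: the function names are hypothetical, and comparing only the last 10 digits is an assumption that suits US-style numbers but would need rethinking for international data or for chat IDs that contain no digits at all.

```python
import re


def normalize_number(raw):
    """Reduce a phone number to its last 10 digits.

    Assumption: US-style numbers, so a leading country code can be dropped.
    """
    digits = re.sub(r"\D", "", raw)
    return digits[-10:]


def build_lookup(address_book):
    """Map each normalized number to the set of names that claim it."""
    lookup = {}
    for name, number in address_book:
        lookup.setdefault(normalize_number(number), set()).add(name)
    return lookup


def resolve(identifier, lookup):
    """Return a name, or a flag when the match is missing or conflicting."""
    names = lookup.get(normalize_number(identifier), set())
    if len(names) == 1:
        return next(iter(names))
    if not names:
        return f"UNRESOLVED ({identifier})"
    return "CONFLICT (" + " / ".join(sorted(names)) + ")"


# Hypothetical address book with one number stored under two names.
address_book = [
    ("Pat Lee", "(555) 010-0199"),
    ("Sam Ortiz", "+1 555 010 0142"),
    ("S. Ortiz (old)", "555-010-0142"),
]
lookup = build_lookup(address_book)

print(resolve("+15550100199", lookup))
print(resolve("5550100142", lookup))
print(resolve("555-010-0000", lookup))
```

Flagging conflicts and unresolved identifiers, rather than guessing, leaves the judgment call about who a party really is with the review team.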
Using spreadsheet reports
A common approach to analyzing and reviewing mobile data is to work with a spreadsheet-based report generated from the device’s forensic image or from an extract of the device’s SQLite database (see Figure 2). Often the review of the spreadsheet report is performed by a single person. That person goes through the rows of information in the spreadsheet to determine which rows to produce. After production decisions have been made, a new copy of the spreadsheet is created. The new copy contains only those rows of information designated to be produced. That copy then is produced to the other party.
Although this approach is simple on its face, it has several disadvantages, especially in matters with multiple mobile devices. One problem is that those using this approach typically do not apply document control numbers to conversations and other data within the report. The lack of control numbers can lead to confusion among the parties about which communications they are referring to and can mean extra time spent confirming that everyone is referring to the same information. An additional problem is that using these spreadsheets generally means working outside of a centralized platform. That makes several tasks harder, and the difficulty is substantive as well as logistical: tracking decisions about the device data, sharing work product, and providing team members with access to both the device data and information about how that data was treated.
Using review platforms
An alternative approach is to reorganize the content of the reports so that it can be loaded into a review platform. This often means starting with individual spreadsheets, generating a review database record for each record reflected in each spreadsheet, assigning a control number to each record, recording parent-child relationships, and so on. It also means creating one or more load files (see Figure 3) so that content can be loaded into the review platform in a usable form. The load files, records, and any native files from the device then are loaded into the review platform. To help the review team work with the device data now available in the review platform, typically someone with appropriate expertise creates customized layouts and views for the device data and adds tags that can be used to track relevance, privilege, and other decisions made about the content.
A challenge with this approach is that each piece of information from each report is treated as a separate document. This means every individual text or chat is represented as a separate document in the database. As a result, understanding context, determining which information to produce, and deciding how to produce it can become complicated.
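The preparation steps for this approach (one record per message, a control number for each record, and parent-child links to attachments) can be sketched in Python as follows. The field names, the "MOB" prefix, and the comma-delimited layout are all assumptions made for illustration; review platforms define their own load file formats and delimiters, so check the target platform's specification before building anything like this.

```python
import csv
import io

FIELDS = ["ControlNumber", "ParentID", "Sender", "SentUTC", "Text", "NativeFile"]


def assign_control_numbers(records, prefix="MOB", start=1, width=7):
    """Give every message and attachment its own control number and
    record each attachment's parent message."""
    rows = []
    counter = start
    for rec in records:
        parent_id = f"{prefix}{counter:0{width}d}"
        counter += 1
        rows.append({"ControlNumber": parent_id, "ParentID": "",
                     "Sender": rec["sender"], "SentUTC": rec["sent_utc"],
                     "Text": rec["body"], "NativeFile": ""})
        for path in rec.get("attachments", []):
            rows.append({"ControlNumber": f"{prefix}{counter:0{width}d}",
                         "ParentID": parent_id, "Sender": rec["sender"],
                         "SentUTC": rec["sent_utc"], "Text": "",
                         "NativeFile": path})
            counter += 1
    return rows


def write_load_file(rows, handle):
    """Write the rows as a delimited load file with a header line."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical records, as they might be read from a spreadsheet report.
records = [
    {"sender": "+15550100", "sent_utc": "2020-03-02T09:30:00",
     "body": "Contract attached", "attachments": ["natives/contract.pdf"]},
    {"sender": "+15550101", "sent_utc": "2020-03-02T09:31:00", "body": "Got it"},
]
rows = assign_control_numbers(records)
buffer = io.StringIO()
write_load_file(rows, buffer)
load_file_text = buffer.getvalue()
```

Two short messages and one attachment already yield three separate database records, which illustrates how quickly the document count, and the loss of conversational context, can grow.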
A near-native approach
Recognizing the need for effective review of mobile data in centralized review platforms, several providers have been developing processes to format chat, SMS, and MMS data in a “near-native” format. Figure 4 shows an example from Relativity. The resulting document looks more like a standard chat conversation, allowing reviewers to read it more easily and better place it in context.
This emerging approach requires that thought be given to formatting issues. You will need to make decisions about what constitutes an appropriate “document break.” Should a full day of chats be defined as a single document, for example, or would another time period make sense? You also will need to consider the possibility that the resulting documents may not contain all the context necessary to assess the information.
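To make the document break question concrete, here is a minimal Python sketch that groups messages into one document per conversation per calendar day. The field names and the one-day rule are assumptions chosen for illustration; providers implement their own logic, and a different window may suit a given matter better.

```python
from collections import defaultdict
from datetime import datetime


def break_into_documents(messages):
    """Group messages into one document per conversation per calendar day."""
    groups = defaultdict(list)
    for msg in messages:
        day = datetime.fromisoformat(msg["sent_utc"]).date()
        groups[(msg["conversation"], day)].append(msg)

    documents = []
    for conversation, day in sorted(groups):
        ordered = sorted(groups[(conversation, day)], key=lambda m: m["sent_utc"])
        lines = [f'{m["sent_utc"]} {m["sender"]}: {m["body"]}' for m in ordered]
        documents.append({"conversation": conversation,
                          "day": day.isoformat(),
                          "text": "\n".join(lines)})
    return documents


# Hypothetical exchange that straddles midnight.
messages = [
    {"conversation": "chat-1", "sender": "Pat",
     "sent_utc": "2020-03-01T23:58:00", "body": "Did the wire go out?"},
    {"conversation": "chat-1", "sender": "Sam",
     "sent_utc": "2020-03-02T00:01:00", "body": "Yes, an hour ago."},
]
documents = break_into_documents(messages)
```

With a one-day break, the question and its answer land in two different documents. That is exactly the kind of lost context to weigh when choosing a break rule.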
Additional challenges
Regardless of how you format mobile device data for review, you likely will face other data management challenges, especially when you need to work with data from multiple devices. Currently there is no consistently effective way of deduplicating mobile data, either within or across devices. Searching mobile data also can be more difficult because it comprises many metadata fields that are not commonly present in e-discovery. You may need to build custom search indexes to ensure that your search is reasonably complete.
Production of mobile device data
Many of the questions surrounding review of mobile device data also apply to production. Producing mobile device data differs from producing other ESI because the data records themselves are not true native files. Mobile device data is a collection of metadata and, possibly, attachments. The attachments can be viewed natively, but the records cannot. To produce the records in a near-native format, you must convert the data into a new format, and as with review, several different options are available.
Currently, the two primary approaches for production are the same as in review: produce a spreadsheet with the metadata and any accompanying native file attachments, or produce near-native format files with accompanying load files and native attachment files. The decision about which approach to use depends on the negotiations between the parties and the relative costs of each method. Producing in spreadsheets is cheaper and faster, but in matters involving numerous mobile devices, managing numerous spreadsheets can be unwieldy without the controls that a database-centered approach with near-native files makes possible.
Regardless of the production method that you choose, be mindful of several challenges for any production. A particularly notable challenge is redaction. If you are redacting parts of or entire messages, you need to make sure the message is truly redacted in both the metadata and near-native files. While there are innovative solutions for this in the market, no standard approach or set of best practices has been established. This means that any production guidelines should be carefully defined by both parties and that a thorough review of the production should be performed.
Conclusion
Options for getting mobile device data ready to analyze, review, produce, and ultimately present still are limited. Although tools and techniques continue to improve, at this stage a significant amount of manual or custom work often is needed to get the value you are seeking from that data. As we gaze into our murky crystal balls, however, we see more than a little hope for the future of mobile device data review and production.
In our next post, we will go beyond review (at least in the traditional sense), to discuss strategies for managing non-communication mobile device data (e.g., web browser history or address books) and performing analytics of mobile device content.