{"id":33095,"date":"2018-11-06T10:00:33","date_gmt":"2018-11-06T15:00:33","guid":{"rendered":"https:\/\/sdtimes.com\/?p=33095"},"modified":"2018-11-07T16:03:28","modified_gmt":"2018-11-07T21:03:28","slug":"getting-to-the-root-of-your-datas-history","status":"publish","type":"post","link":"https:\/\/sdtimes.com\/data\/getting-to-the-root-of-your-datas-history\/","title":{"rendered":"Getting to the root of your data\u2019s history"},"content":{"rendered":"<p>With the rise of big <a href=\"https:\/\/sdtimes.com\/tag\/data\/\">data<\/a> platforms such as Apache Hadoop and Spark, more and more enterprises are pouring enterprise information into data lakes and launching related initiatives around data quality, data governance, regulatory compliance, and more reliable business intelligence (BI). To prevent the new lakes from turning into swamps, however, businesses are organizing their reams of data via the data\u2019s lineage.<\/p>\n<p>Enterprises have long managed and queried relational data in structured databases and data marts. \u00a0Emerging environments such as Hadoop, however, often bring together this information with semi-structured data from NoSQL databases, emails and XML documents as well as unstructured information like Microsoft Office files, web pages, videos, audio files, photos, social media messages, and satellite images.<\/p>\n<p>\u201cEven though data is becoming more accessible, users still rely on receiving data from trusted internal sources. For a company, it\u2019s important for users to know and understand the source and veracity of the data. 
Data lineage tools enable companies to track, audit and provide a visual of data movement from the source to the target, which also ties into the required data governance processes,\u201d said Sue Clark, senior CTO architect at Sungard AS, a customer of Informatica, Teradata, and Qlik.<\/p>\n<p>Through new laws like the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) of 2018, government regulators are requiring organizations to better manage data originating in all types of raw formats. Enterprises also face increasing demands from business managers for higher-quality data for use in predictive analysis and other BI reports.<\/p>\n<p>\u201cToday, companies can\u2019t afford not to make data-driven decisions, which means understanding where data comes from &#8212; and how it has changed along the way &#8212; to solve business problems,\u201d according to Harald Smith, director of product management at Syncsort, a specialist in data integration software and services.<\/p>\n<p>\u201cRegulatory compliance demands accuracy, and data lineage tools guarantee a significantly more accurate approach to data management,\u201d echoed Amnon Drori, founder and CEO of Octopai, maker of an automated data lineage and metadata management search engine for BI groups.<\/p>\n<p>Data lineage tools also show up in self-service BI solutions, although such solutions apparently aren\u2019t yet available to many users. In one recent study, TDWI found that only 20 percent of the companies surveyed said their personnel could identify trusted data sources on their own. 
Further, merely 18 percent responded that personnel could \u00a0\u201cdetermine data lineage \u2014 that is, who created the data set and where it came from \u2014 without close IT support,\u201d according to the report.<\/p>\n<p>\u201cIf users and analysts are to work effectively with self-service BI and analytics, they need to be confident that they can locate trusted data and know its lineage. \u00a0For self-service to prosper, IT and\/or the CDO function must help users by stewarding their experiences and pointing them to trusted, well-governed sources for their analysis,\u201d recommended TDWI.<\/p>\n<p>Even fewer of the respondents to TDWI\u2019s survey, or 16 percent, said their end users were able to query sources such as Hadoop clusters and data lakes \u2013 but then again, only about one-quarter of the participating organizations even had a data lake.<\/p>\n<p><b>What are data lineage tools, anyway?<\/b><br \/>\nDozens of proprietary and open-source vendors are converging on data lineage, from a bunch of different directions. Vendors, customers and analysts define data lineage tools in a wide variety of ways, but Gartner has arrived at one short yet highly serviceable definition.<\/p>\n<p>\u201cData lineage specifies the data\u2019s origins and where it moves over time. It also describes what happens to data as it goes through diverse processes. Data lineage can help to analyze how information is used and to track key bits of information that serve a particular purpose,\u201d according to Gartner\u2019s 2018 Magic Quadrant for Metadata Management Solutions (MMS) report.<\/p>\n<p>Muddying the definitional waters a bit is the fact that enterprises generally use data lineage tools within sweeping organizational initiatives. Accordingly, vendors often integrate these tools with related data management or BI functions, either within their own platforms or with partners\u2019 solutions. 
Customers also perform their own tool integrations.<\/p>\n<p>Some data lineage tools also transform, or convert, data into other formats, while other vendors handle these conversions through separate ETL (extract, transform, load) tools. Syncsort\u2019s DMX-h, for example, accesses data from the mainframe, RDBMS, or other legacy sources and provides security, management, and end-to-end lineage tracking. It also transforms legacy data sources into Hadoop-compatible formats.<\/p>\n<p>Beyond simply tracking data, organizations need to be able to consume the data lineage information in a way that gives them a better understanding of what it means, said Syncsort\u2019s Smith. Consequently, Syncsort recently teamed up with Cloudera to make its lineage information accessible through Cloudera Navigator, a data governance solution for Hadoop that collects audit logs from across the entire platform and maintains a full history, viewable through a graphical user interface (GUI) dashboard.<\/p>\n<p>For organizations that don\u2019t use Navigator, DMX-h makes the lineage information available through a REST API, which IT departments can use for integration with other governance solutions.<\/p>\n<p><b>Some perform impact analysis<\/b><br \/>\nSome data lineage tools also offer impact analysis capabilities. \u201cWith the implementation of GDPR, companies in possession of personal data of EU residents have had to make significant changes to ensure compliance. A large part of this pertains to access &#8212; giving people access to their own personal data, enabling portability of the data, changing or deleting the data,\u201d according to Drori.<\/p>\n<p>\u201cBefore any company can make a change to its data, it must first locate the data and then of course understand the impact of making a particular change. 
Data lineage tools are helping BI groups to perform impact analysis ahead of compliance with regulations like GDPR.\u201d<\/p>\n<p>In one real-world scenario, for example, a business analyst needed to erase PII, an age column in a particular report, so that customer age would become private. Data lineage tools helped to solve the problem.<\/p>\n<p>\u201cBefore erasing a column the analyst had to understand which processes were involved in creating this particular report and what kind of impact the deletion of this age column would have on other reports. Without data lineage tools, impact analysis can be really tricky and sometimes impossible to perform accurately,\u201d he told SD Times.<\/p>\n<p><b>Where does data lineage fit?<\/b><br \/>\nExperts slice and dice the data management and BI markets into myriad kinds of pieces. In characterizing where data lineage tools fit, major analyst firms such as Gartner and IDC place these tools in the general classification of metadata management.<\/p>\n<p><b>Gartner\u2019s take<\/b>. 
Beyond tools for data lineage and impact analysis, products in the metadata management category can include metadata repositories, or libraries; business glossaries; semantic frameworks; rules management tools; and tools for metadata ingestion and translation, according to Gartner.<\/p>\n<p>Tools in the latter category include techniques and bridges for various data sources such as ETL; BI and reporting tools; modeling tools; DBMS catalogs; ERP and other applications; XML formats; hardware and network log files; PDF and Microsoft Excel\/Word documents; business metadata; and custom metadata.<\/p>\n<p>Vendors who made it into Gartner\u2019s 2018 Magic Quadrant for MMS are as follows: Adaptive, Alation, Alex Solutions, ASG Technologies, Collibra, Data Advantage Group, Datum, Global IDs, IBM, Infogix, Informatica, Oracle, SAP, and Smartlogic.<\/p>\n<p>\u201cI don&#8217;t have exact adoption rates, but awareness of doing proper metadata management is growing. Initial resistance at the thought it would take away from agility is going away. Organizations can actually add new workloads much faster because the proper discipline is in place,\u201d said Sanjeev Mohan, a research analyst for big data and cloud\/SaaS at Gartner, during an interview with SD Times.<\/p>\n<p>Organizations, though, have differing reasons for engaging in data quality initiatives. Before launching an initiative and deciding on an approach to take, an enterprise should first determine the business use case, he advised. \u00a0\u201cIs it regulatory compliance? Risk reduction? Predictive analysis?\u201d<\/p>\n<p><b>IDC\u2019s views<\/b>. Stewart Bond, an IDC analyst, classifies metadata management tools as belonging to a larger category, called data intelligence software. Further, Bond views data intelligence software as a collection of capabilities which can help organizations answer fundamental questions about data. 
The list is rather long, but it includes questions such as when the data was created, who is currently using it, where it resides, and why it exists. The answers can inform and guide use cases around data governance, data quality management, and self-service data, he says.<\/p>\n<p>\u201cTo collect these answers, organizations must harness the power of metadata that is generated every time data is captured at a source, moves through an organization, is accessed by users, is profiled, cleansed, aggregated, augmented and used for analytics for operational or strategic decision-making. Data intelligence software goes beyond just metadata management, and includes data cataloging, master data definition and control, data profiling and data stewardship,\u201d Bond wrote in a recent blog.<\/p>\n<p>Data intelligence is a subset and different view of Data Integration and Integrity software (DIIS), another market view defined by IDC, according to Bond, who is research director of DIIS at IDC. 
\u201cData intelligence contains software for data profiling and stewardship, master data definition and control, data cataloging and data lineage \u2013 all which also map into the data quality, metadata management and master data segments in the full DIIS market,\u201d Bond told SD Times in an email.<\/p>\n<p>Examples of vendors included in IDC\u2019s data intelligence and DIIS views are Alation, ASG Technologies, BackOffice Associates, Collibra, Datum, IBM, Infogix, Informatica, Manta, Oracle, SAP, SAS, Syncsort, Tamr, TIBCO, Unifi, and Waterline Data.<\/p>\n<p>However, many products containing data lineage tools are not included in IDC\u2019s data intelligence and DIIS views, or in Gartner\u2019s MMS Magic Quadrant, typically because they don\u2019t meet the specific criteria for those categories and are covered by other areas of analysts\u2019 organizations.<\/p>\n<p><b>Which data lineage tools are best?<\/b><br \/>\nWith so many choices available, which data lineage tools will best meet your needs? Factors to consider include whether an initiative is IT- or business-driven, the types of additional data management or BI functionality that will be required, and whether using open-source software is important to the organization, experts say.<\/p>\n<p>Some IT-driven initiatives are concerned with pruning and curating the organization\u2019s information into data catalogs, so that the most accurate data can then be reused through enterprise applications. Other initiatives are sparked by business managers seeking to quickly put together consistent and reliable data sets for use within corporate departments or company-wide.<\/p>\n<p>For IT-driven initiatives, for example, Informatica provides data lineage through Metadata Manager, a key component of Informatica PowerCenter Advanced Edition. 
Metadata Manager gathers technical metadata from sources such as ETL and BI tools, applications, databases, data modeling tools, and mainframes.<\/p>\n<p>Metadata Manager shares a central repository with Informatica\u2019s Business Glossary. The technical metadata can be linked to business metadata created by Business Glossary to add context and meaning to data integration. Metadata Manager also provides a graphical view of the data as it moves through the integration environment.<\/p>\n<p>IT developers can use Metadata Manager to perform impact analysis when changes are made to metadata. Enterprise data architects can use the solution\u2019s integration metadata catalog for purposes such as browsing and searching for metadata definitions, defining new custom models, and maintaining common interface definitions for data sources, warehouses, BI, and other applications used in enterprise data integration initiatives.<\/p>\n<p>In stark contrast, Datawatch targets its Monarch platform at business-driven initiatives. Monarch allows domain experts in business departments to pull metadata for documents in multiple formats \u2013 such as Excel spreadsheets, Oracle RDBMS, and Salesforce.com \u2013 and then use the metadata to build dashboard-driven models for reuse within their departments, said Jon Pilkington, Datawatch\u2019s CPO, in an interview with SD Times.<\/p>\n<p>Monarch\u2019s data lineage tools document \u201cwhere the raw data came from, how it&#8217;s been altered, who did it, when they did it,\u201d for instance. \u201cThe model then becomes what users search for and shop,\u201d Pilkington remarked.<\/p>\n<p>Monarch extracts the raw data in rows and columns. After it\u2019s extracted, a domain expert uses Monarch\u2019s point-and-click user interface to convert, clean, blend and enrich data without performing any coding or scripting. 
It can then be analyzed directly within Monarch or exported to Excel spreadsheets or third-party advanced analytics and visualization tools through the use of built-in connectors.<\/p>\n<p>Within its own marketing department, for example, Datawatch has used its tools to generate reports by salespeople about how information turns into a sales lead and how long it takes to turn a lead into a sale. \u201cWe use 11 different data sources for this \u2013 including Google Ad Words and the Zendesk support system \u2013 and the apps don\u2019t necessarily play well together. It took many steps for the domain expert to get the information into shape, but now that the model is done, it can be reused by any salesperson in the department.\u201d<\/p>\n<p><b>Three approaches to data management<\/b><br \/>\nAs Gartner\u2019s Sanjeev Mohan sees it, enterprises can take any of three approaches to data management initiatives: developing their own tools, mixing and matching best-of-breed tools, or investing in a broader platform or suite.<\/p>\n<p>By choosing a best-of-breed data lineage tool or metadata management package, a customer can achieve strong support for a specific use-case scenario, the analyst observed. On the other hand, customers often need to perform their own tool integrations, a process that can be expensive and time-consuming.<\/p>\n<p>Sungard AS is one example of an enterprise that is taking a best-of-breed approach. \u201cAs part of its internal handling of data and its sources, Sungard AS uses Teradata and Informatica, with Qlik on top of Teradata for ease of business user access and to make data-backed business decisions easier,\u201d Sungard\u2019s Sue Clark told SD Times.<\/p>\n<p><b>Open source vs. proprietary<\/b>. Most solutions offering data lineage capabilities are proprietary, said Gartner\u2019s Mohan. 
Yet some are open source, including offerings from Hortonworks, Cloudera, MapR and the now Google-owned Cask Data, in addition to Teradata\u2019s Kylo.<\/p>\n<p>\u201cWe don\u2019t like to lock customers into a specific vendor,\u201d said Shaun Bierweiler, vice president of U.S. Public Sector at Hortonworks, in an interview with SD Times.<\/p>\n<p>Hortonworks is now working with the United States Census Bureau to provide technology for the 2020 census, the first national census to be conducted in a mainly electronic way. The Hortonworks Data Platform (HDP) will serve as the Census Data Lake, storing most of the census data, while also acting as a staging ground for joining data from other databases. The Census Data Lake will store both structured data and unstructured data such as street-level and aerial map imagery from Google.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>With the rise of big data platforms such as Apache Hadoop and Spark, more and more enterprises are pouring enterprise information into data lakes and launching related initiatives around data quality, data governance, regulatory compliance, and more reliable business intelligence (BI). 
To prevent the new lakes from turning into swamps, however, businesses are organizing their  &hellip; <a class=\"read-more\" href=\"https:\/\/sdtimes.com\/data\/getting-to-the-root-of-your-datas-history\/\">continue reading<\/a><!-- AddThis Advanced Settings generic via filter on get_the_excerpt --><!-- AddThis Share Buttons generic via filter on get_the_excerpt --><\/p>\n","protected":false},"author":744,"featured_media":33096,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"om_disable_all_campaigns":false,"cybocfi_hide_featured_image":"","footnotes":"","_links_to":"","_links_to_target":""},"categories":[1],"tags":[12021,14589,2567],"coauthors":[11506],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v23.8 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Getting to the root of your data\u2019s history - SD Times<\/title>\n<meta name=\"description\" content=\"Data lineage helps organizations understand where their data originated, who might have altered, and where it currently resides, to help keep data authenticated and private.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/sdtimes.com\/data\/getting-to-the-root-of-your-datas-history\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Getting to the root of your data\u2019s history - SD Times\" \/>\n<meta property=\"og:description\" content=\"Data lineage helps organizations understand where their data originated, who might have altered, and where it currently resides, to help keep data authenticated and private.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/sdtimes.com\/data\/getting-to-the-root-of-your-datas-history\/\" \/>\n<meta property=\"og:site_name\" content=\"SD 
Times\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/SDTimesD2\" \/>\n<meta property=\"article:published_time\" content=\"2018-11-06T15:00:33+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2018-11-07T21:03:28+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/sdtimes.com\/wp-content\/uploads\/2018\/11\/dataLineage_H.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"660\" \/>\n\t<meta property=\"og:image:height\" content=\"371\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Jacqueline Emigh\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@sdtimes\" \/>\n<meta name=\"twitter:site\" content=\"@sdtimes\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Jacqueline Emigh\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"12 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/sdtimes.com\/data\/getting-to-the-root-of-your-datas-history\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/sdtimes.com\/data\/getting-to-the-root-of-your-datas-history\/\"},\"author\":{\"name\":\"Jacqueline Emigh\",\"@id\":\"https:\/\/sdtimes.com\/#\/schema\/person\/d3b82fc17dfec49a99481487f4d0fddc\"},\"headline\":\"Getting to the root of your data\u2019s 
history\",\"datePublished\":\"2018-11-06T15:00:33+00:00\",\"dateModified\":\"2018-11-07T21:03:28+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/sdtimes.com\/data\/getting-to-the-root-of-your-datas-history\/\"},\"wordCount\":2443,\"publisher\":{\"@id\":\"https:\/\/sdtimes.com\/#organization\"},\"image\":{\"@id\":\"https:\/\/sdtimes.com\/data\/getting-to-the-root-of-your-datas-history\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/sdtimes.com\/wp-content\/uploads\/2018\/11\/dataLineage_H.jpg\",\"keywords\":[\"data lakes\",\"Data lineage\",\"data management\"],\"articleSection\":[\"Latest News\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/sdtimes.com\/data\/getting-to-the-root-of-your-datas-history\/\",\"url\":\"https:\/\/sdtimes.com\/data\/getting-to-the-root-of-your-datas-history\/\",\"name\":\"Getting to the root of your data\u2019s history - SD Times\",\"isPartOf\":{\"@id\":\"https:\/\/sdtimes.com\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/sdtimes.com\/data\/getting-to-the-root-of-your-datas-history\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/sdtimes.com\/data\/getting-to-the-root-of-your-datas-history\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/sdtimes.com\/wp-content\/uploads\/2018\/11\/dataLineage_H.jpg\",\"datePublished\":\"2018-11-06T15:00:33+00:00\",\"dateModified\":\"2018-11-07T21:03:28+00:00\",\"description\":\"Data lineage helps organizations understand where their data originated, who might have altered, and where it currently resides, to help keep data authenticated and 
private.\",\"breadcrumb\":{\"@id\":\"https:\/\/sdtimes.com\/data\/getting-to-the-root-of-your-datas-history\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/sdtimes.com\/data\/getting-to-the-root-of-your-datas-history\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/sdtimes.com\/data\/getting-to-the-root-of-your-datas-history\/#primaryimage\",\"url\":\"https:\/\/sdtimes.com\/wp-content\/uploads\/2018\/11\/dataLineage_H.jpg\",\"contentUrl\":\"https:\/\/sdtimes.com\/wp-content\/uploads\/2018\/11\/dataLineage_H.jpg\",\"width\":660,\"height\":371},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/sdtimes.com\/data\/getting-to-the-root-of-your-datas-history\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/sdtimes.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Getting to the root of your data\u2019s history\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/sdtimes.com\/#website\",\"url\":\"https:\/\/sdtimes.com\/\",\"name\":\"SD Times\",\"description\":\"Software Development News\",\"publisher\":{\"@id\":\"https:\/\/sdtimes.com\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/sdtimes.com\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/sdtimes.com\/#organization\",\"name\":\"SD Times\",\"url\":\"https:\/\/sdtimes.com\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/sdtimes.com\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/sdtimes.com\/wp-content\/uploads\/2014\/05\/deafaultlogo.png\",\"contentUrl\":\"https:\/\/sdtimes.com\/wp-content\/uploads\/2014\/05\/deafaultlogo.png\",\"width\":225,\"height\":90,\"caption\":\"SD 
Times\"},\"image\":{\"@id\":\"https:\/\/sdtimes.com\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/SDTimesD2\",\"https:\/\/x.com\/sdtimes\",\"https:\/\/www.linkedin.com\/company\/sdtimes\/\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/sdtimes.com\/#\/schema\/person\/d3b82fc17dfec49a99481487f4d0fddc\",\"name\":\"Jacqueline Emigh\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/sdtimes.com\/#\/schema\/person\/image\/48ad5a6934d017a7a0938970b69578a4\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/4897600bad29f8514a9f09291555e2b4?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/4897600bad29f8514a9f09291555e2b4?s=96&d=mm&r=g\",\"caption\":\"Jacqueline Emigh\"},\"description\":\"Jacqueline Emigh is a contributing editor for SD Times and ITOPs Times.\",\"url\":\"https:\/\/sdtimes.com\/author\/jacqueline-emigh\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Getting to the root of your data\u2019s history - SD Times","description":"Data lineage helps organizations understand where their data originated, who might have altered, and where it currently resides, to help keep data authenticated and private.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/sdtimes.com\/data\/getting-to-the-root-of-your-datas-history\/","og_locale":"en_US","og_type":"article","og_title":"Getting to the root of your data\u2019s history - SD Times","og_description":"Data lineage helps organizations understand where their data originated, who might have altered, and where it currently resides, to help keep data authenticated and private.","og_url":"https:\/\/sdtimes.com\/data\/getting-to-the-root-of-your-datas-history\/","og_site_name":"SD 
Times","article_publisher":"https:\/\/www.facebook.com\/SDTimesD2","article_published_time":"2018-11-06T15:00:33+00:00","article_modified_time":"2018-11-07T21:03:28+00:00","og_image":[{"width":660,"height":371,"url":"https:\/\/sdtimes.com\/wp-content\/uploads\/2018\/11\/dataLineage_H.jpg","type":"image\/jpeg"}],"author":"Jacqueline Emigh","twitter_card":"summary_large_image","twitter_creator":"@sdtimes","twitter_site":"@sdtimes","twitter_misc":{"Written by":"Jacqueline Emigh","Est. reading time":"12 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/sdtimes.com\/data\/getting-to-the-root-of-your-datas-history\/#article","isPartOf":{"@id":"https:\/\/sdtimes.com\/data\/getting-to-the-root-of-your-datas-history\/"},"author":{"name":"Jacqueline Emigh","@id":"https:\/\/sdtimes.com\/#\/schema\/person\/d3b82fc17dfec49a99481487f4d0fddc"},"headline":"Getting to the root of your data\u2019s history","datePublished":"2018-11-06T15:00:33+00:00","dateModified":"2018-11-07T21:03:28+00:00","mainEntityOfPage":{"@id":"https:\/\/sdtimes.com\/data\/getting-to-the-root-of-your-datas-history\/"},"wordCount":2443,"publisher":{"@id":"https:\/\/sdtimes.com\/#organization"},"image":{"@id":"https:\/\/sdtimes.com\/data\/getting-to-the-root-of-your-datas-history\/#primaryimage"},"thumbnailUrl":"https:\/\/sdtimes.com\/wp-content\/uploads\/2018\/11\/dataLineage_H.jpg","keywords":["data lakes","Data lineage","data management"],"articleSection":["Latest News"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/sdtimes.com\/data\/getting-to-the-root-of-your-datas-history\/","url":"https:\/\/sdtimes.com\/data\/getting-to-the-root-of-your-datas-history\/","name":"Getting to the root of your data\u2019s history - SD 
Times","isPartOf":{"@id":"https:\/\/sdtimes.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/sdtimes.com\/data\/getting-to-the-root-of-your-datas-history\/#primaryimage"},"image":{"@id":"https:\/\/sdtimes.com\/data\/getting-to-the-root-of-your-datas-history\/#primaryimage"},"thumbnailUrl":"https:\/\/sdtimes.com\/wp-content\/uploads\/2018\/11\/dataLineage_H.jpg","datePublished":"2018-11-06T15:00:33+00:00","dateModified":"2018-11-07T21:03:28+00:00","description":"Data lineage helps organizations understand where their data originated, who might have altered, and where it currently resides, to help keep data authenticated and private.","breadcrumb":{"@id":"https:\/\/sdtimes.com\/data\/getting-to-the-root-of-your-datas-history\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/sdtimes.com\/data\/getting-to-the-root-of-your-datas-history\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/sdtimes.com\/data\/getting-to-the-root-of-your-datas-history\/#primaryimage","url":"https:\/\/sdtimes.com\/wp-content\/uploads\/2018\/11\/dataLineage_H.jpg","contentUrl":"https:\/\/sdtimes.com\/wp-content\/uploads\/2018\/11\/dataLineage_H.jpg","width":660,"height":371},{"@type":"BreadcrumbList","@id":"https:\/\/sdtimes.com\/data\/getting-to-the-root-of-your-datas-history\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/sdtimes.com\/"},{"@type":"ListItem","position":2,"name":"Getting to the root of your data\u2019s history"}]},{"@type":"WebSite","@id":"https:\/\/sdtimes.com\/#website","url":"https:\/\/sdtimes.com\/","name":"SD Times","description":"Software Development 
News","publisher":{"@id":"https:\/\/sdtimes.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/sdtimes.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/sdtimes.com\/#organization","name":"SD Times","url":"https:\/\/sdtimes.com\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/sdtimes.com\/#\/schema\/logo\/image\/","url":"https:\/\/sdtimes.com\/wp-content\/uploads\/2014\/05\/deafaultlogo.png","contentUrl":"https:\/\/sdtimes.com\/wp-content\/uploads\/2014\/05\/deafaultlogo.png","width":225,"height":90,"caption":"SD Times"},"image":{"@id":"https:\/\/sdtimes.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/SDTimesD2","https:\/\/x.com\/sdtimes","https:\/\/www.linkedin.com\/company\/sdtimes\/"]},{"@type":"Person","@id":"https:\/\/sdtimes.com\/#\/schema\/person\/d3b82fc17dfec49a99481487f4d0fddc","name":"Jacqueline Emigh","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/sdtimes.com\/#\/schema\/person\/image\/48ad5a6934d017a7a0938970b69578a4","url":"https:\/\/secure.gravatar.com\/avatar\/4897600bad29f8514a9f09291555e2b4?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/4897600bad29f8514a9f09291555e2b4?s=96&d=mm&r=g","caption":"Jacqueline Emigh"},"description":"Jacqueline Emigh is a contributing editor for SD Times and ITOPs 
Times.","url":"https:\/\/sdtimes.com\/author\/jacqueline-emigh\/"}]}},"_links":{"self":[{"href":"https:\/\/sdtimes.com\/wp-json\/wp\/v2\/posts\/33095"}],"collection":[{"href":"https:\/\/sdtimes.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sdtimes.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sdtimes.com\/wp-json\/wp\/v2\/users\/744"}],"replies":[{"embeddable":true,"href":"https:\/\/sdtimes.com\/wp-json\/wp\/v2\/comments?post=33095"}],"version-history":[{"count":3,"href":"https:\/\/sdtimes.com\/wp-json\/wp\/v2\/posts\/33095\/revisions"}],"predecessor-version":[{"id":33161,"href":"https:\/\/sdtimes.com\/wp-json\/wp\/v2\/posts\/33095\/revisions\/33161"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/sdtimes.com\/wp-json\/wp\/v2\/media\/33096"}],"wp:attachment":[{"href":"https:\/\/sdtimes.com\/wp-json\/wp\/v2\/media?parent=33095"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sdtimes.com\/wp-json\/wp\/v2\/categories?post=33095"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sdtimes.com\/wp-json\/wp\/v2\/tags?post=33095"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/sdtimes.com\/wp-json\/wp\/v2\/coauthors?post=33095"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}