EP18 - Best Practices for Modernizing Your Hadoop Workloads to AWS with Dremio

Gnarly Data Waves by Dremio

24-05-2023 • 37 Min.

Many organizations turned to HDFS to address the challenge of storing growing volumes of semi-structured and unstructured data. However, Hadoop never managed to replace the data warehouse for enterprise-grade business intelligence and reporting, and most teams ended up with separate monolithic architectures, including data lakes and data warehouses, with siloed data and analytic workloads. That is why data teams are increasingly considering a data lakehouse architecture that combines the flexibility and scalability of data lake storage with the data management, data governance, and enterprise-grade analytic performance of the data warehouse.

In this episode, Jorge A. Lopez, Product Specialist for Analytics at AWS, and Dremio's Jeremiah Morrow will discuss best practices for modernizing analytic workloads from Hadoop to an open data lakehouse architecture, including:

- Choosing the right storage solution for your data lakehouse, and which features and functionality, such as performance, scalability, reliability, and more, you should be evaluating.
- Specific steps and best practices for gradually shifting on-premises workloads to a cloud data lakehouse while ensuring business continuity.
- Consolidating data silos to achieve a complete view of your customer and operational data before, during, and after migration.

See all upcoming episodes: https://www.dremio.com/gnarly-data-wa...

Connect with us!
Twitter: https://bit.ly/30pcpE1
LinkedIn: https://bit.ly/2PoqsDq
Facebook: https://bit.ly/2BV881V
Community Forum: https://bit.ly/2ELXT0W
Github: https://bit.ly/3go4dcM
Blog: https://bit.ly/2DgyR9B
Questions?: https://bit.ly/30oi8tX
Website: https://bit.ly/2XmtEnN

#datalakehouse #data #analytics #datawarehouse #datalake #dataengineers #dataarchitects #governance #infrastructure #dremiocloud #dremiotestdrive #openlakehouse #opendatalakehouse #gnarlydatawaves #apacheiceberg #dremioarctic #datamesh #metadata #modernization #datasharing #migration #ETL #datasilos #selfservice #compliance #dataascode #branches #tags #optimized #automates #datamovement #clustering #metrics #filtering #partitioning #sorting #tableformat #metastore #ApacheArrow #nessie #sonar #dremiosonar #optimization #automaticdata #aws #scalability