Monday, October 12, 2020, 5:51 GMT
Jia Yangqing, the Vice President of Alibaba Group and Senior Fellow of Compute Platform, said: “Alibaba Cloud LakeHouse integrates the flexibility and rich ecosystem of data lakes with the enterprise-grade capabilities of data warehouses. This allows enterprises to build a new computing platform that integrates data lakes and data warehouses. Alibaba Cloud LakeHouse not only supports large-scale machine learning and deep learning, but also helps enterprises efficiently improve their big data capabilities, achieve agile operations, reduce costs, and improve efficiency.”
Building on its original data warehouse infrastructure, MaxCompute combines the data warehouse model, which provides a single storage and computing infrastructure, with the data lake model, which separates cloud storage from computing. This integration yields an architecture in which data warehouses and data lakes are interconnected. Multiple storage systems coexist at the underlying layer, but a single storage access layer and single metadata management present an optimized, encapsulated interface to the upper-layer engines. As a result, a table in the data warehouse can be joined with a table in a data lake. The overall architecture also provides a single mid-end for data protection and data management.
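The idea of a single access layer over coexisting storage systems can be illustrated with a minimal Python sketch. It is not MaxCompute's API: the names `WarehouseStore`, `LakeStore`, `UnifiedCatalog`, and `join` are hypothetical, the "lake" is simulated with in-memory CSV lines, and the join is a plain hash join chosen only to show that the engine sees one interface regardless of where a table lives.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Protocol

Row = Dict[str, object]


class Storage(Protocol):
    """Anything that can scan a named table into rows (hypothetical interface)."""

    def scan(self, table: str) -> Iterable[Row]: ...


@dataclass
class WarehouseStore:
    """Stand-in for warehouse-managed storage: tables are already structured rows."""

    tables: Dict[str, List[Row]] = field(default_factory=dict)

    def scan(self, table: str) -> Iterable[Row]:
        return iter(self.tables[table])


@dataclass
class LakeStore:
    """Stand-in for a data lake: each table is raw CSV text, parsed at read time."""

    files: Dict[str, List[str]] = field(default_factory=dict)

    def scan(self, table: str) -> Iterable[Row]:
        lines = self.files[table]
        header = lines[0].split(",")
        for line in lines[1:]:
            yield dict(zip(header, line.split(",")))


@dataclass
class UnifiedCatalog:
    """Single metadata layer: maps each table name to the storage that holds it."""

    entries: Dict[str, Storage] = field(default_factory=dict)

    def register(self, table: str, storage: Storage) -> None:
        self.entries[table] = storage

    def scan(self, table: str) -> Iterable[Row]:
        return self.entries[table].scan(table)


def join(catalog: UnifiedCatalog, left: str, right: str, key: str) -> List[Row]:
    """Hash join on `key`; keys are compared as strings because lake values are text."""
    index: Dict[str, List[Row]] = {}
    for row in catalog.scan(right):
        index.setdefault(str(row[key]), []).append(row)
    joined: List[Row] = []
    for row in catalog.scan(left):
        for match in index.get(str(row[key]), []):
            # Left-side values win when both sides share a column name.
            joined.append({**match, **row})
    return joined


warehouse = WarehouseStore({"orders": [{"user_id": 1, "amount": 30},
                                       {"user_id": 2, "amount": 45}]})
lake = LakeStore({"clicks": ["user_id,page", "1,home", "1,cart", "3,home"]})

catalog = UnifiedCatalog()
catalog.register("orders", warehouse)
catalog.register("clicks", lake)

result = join(catalog, "orders", "clicks", "user_id")
for row in result:
    print(row)
```

The `join` function never checks which backend a table comes from; it only talks to the catalog, which is the role the article assigns to the single storage access layer and metadata management.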
MaxCompute offers four main innovations in the process of technology adoption: accelerated adoption, centralized data and metadata processing, centralized development expertise, and automated data warehousing. It also continues to improve core performance. In the 2020 TPCx-BB 100 TB benchmark (Intel Xeon Scalable Processor), MaxCompute lowered costs by 40 percent. In the 2020 TPCx-BB 30 TB benchmark (Intel Xeon Scalable Processor), performance increased by more than 50 percent and cost decreased by more than 30 percent.
Weibo was an early adopter of Alibaba Cloud LakeHouse. Weibo historically used Hadoop data lakes and data centers in Alibaba Cloud that were completely isolated from each other at the cluster layer, so data could not flow freely between them for computing. To solve these problems, Weibo designed an AI computing mid-end on Alibaba Cloud that combines data lakes and data warehouses. This mid-end removes the heavy burden of data migration and lets Weibo's data engineers and algorithm engineers use Alibaba's proven large-scale computing resources and algorithms to maximize market performance quickly. The MaxCompute cloud-based data warehouses (structured data) and the data lakes (unstructured data) form a closed loop, which greatly improves the efficiency of AI operations and produces significant business value.
MaxCompute, Alibaba Cloud’s proprietary cloud-based data warehouse solution, has stably supported Alibaba Group’s data collection and data computing services through nearly ten years of technological accumulation and is an important part of the big data ecosystem for cloud customers. The introduction of Alibaba Cloud LakeHouse gives companies a more scalable, reliable, and cost-effective approach to building a new big data platform or updating the design of an existing one. This accelerates companies’ digital restructuring.