The online world fills databases with immense amounts of data. Your local grocery stores, financial institutions, streaming services, and even medical providers all keep vast tables of information in multiple databases.
Managing all this data is a big challenge, and applying artificial intelligence to make inferences, apply logical rules or interpret that data can be time-consuming, especially when delays, known as latency, are a major concern. Applications such as supply chain prediction, credit card fraud detection, customer service chatbots, emergency service response and healthcare consulting all require real-time inferences from data managed in a database.
Because existing databases lack support for machine learning inference, a separate process and system is required, a limitation that is particularly critical for applications such as those mentioned above. Transferring data between the two systems dramatically increases latency, and this delay makes it difficult to meet the time constraints of interactive applications seeking real-time results.
Jia Zou, an assistant professor of computer science and engineering at Arizona State University’s Ira A. Fulton Schools of Engineering, and her team of researchers offer a solution that, if successful, will dramatically reduce end-to-end latency for serving models of any scale on data managed by a relational database.
“When performing inference on tens of millions of records, the time spent moving the input-output data can be 20 times longer than the inference time,” Zou says. “Such an increase in latency is unacceptable for many interactive analytics applications that require real-time decision making.”
Zou’s proposal, “Rethinking and Redesigning Analytical Databases for Serving Machine Learning Models,” earned her a 2022 award from the National Science Foundation’s Faculty Early Career Development (CAREER) program.
Zou’s project aims to bridge the gap by designing a new database that seamlessly supports and optimizes the deployment, storage, and service of traditional machine learning models and deep neural network models.
“There are two general approaches to converting deep neural networks and traditional machine learning models into relational queries so that model inference and data queries are seamlessly unified in a single system,” Zou says. “The relation-centric approach decomposes each linear algebra operator into relational algebraic expressions and runs them as a query graph composed of many fine-grained relational operators. The other approach encapsulates the entire machine learning model in a coarse-grained user-defined function that performs as a single relational operator.”
The main difference between the two approaches is that the relation-centric approach accommodates large models but incurs high processing overhead, while the user-defined function-centric approach is more efficient but cannot scale to large models.
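To make the contrast concrete, here is a toy sketch in plain Python, not Zou’s system: a two-output linear model is served first relation-centric style, with the weight matrix stored as a relation and inference computed by a join plus a grouped aggregation, and then UDF-centric style, with the whole model wrapped in one function applied as a single operator. All names and the tiny model are illustrative assumptions.

```python
# Toy illustration (NOT Zou's system): serving a linear model y = W·x two ways,
# with Python lists of tuples standing in for database relations.

# Relation-centric: the weight matrix is a relation W(row, col, val) and the
# input vector is a relation X(col, val); inference is a query graph of
# fine-grained relational operators: JOIN on col, multiply, SUM GROUP BY row.
W = [(0, 0, 2.0), (0, 1, -1.0), (1, 0, 0.5), (1, 1, 3.0)]  # (row, col, val)
X = [(0, 4.0), (1, 1.0)]                                   # (col, val)

def relation_centric(W, X):
    x = dict(X)
    out = {}
    for row, col, w in W:                          # join W with X on col,
        out[row] = out.get(row, 0.0) + w * x[col]  # then aggregate per row
    return sorted(out.items())

# UDF-centric: the entire model is one coarse-grained user-defined function
# that acts as a single relational operator over the input tuple.
def model_udf(x_vec):
    return [2.0 * x_vec[0] - 1.0 * x_vec[1],
            0.5 * x_vec[0] + 3.0 * x_vec[1]]

def udf_centric(X):
    x_vec = [v for _, v in sorted(X)]
    return list(enumerate(model_udf(x_vec)))

print(relation_centric(W, X))  # [(0, 7.0), (1, 5.0)]
print(udf_centric(X))          # [(0, 7.0), (1, 5.0)]
```

Both paths compute the same result; the relation-centric version generalizes to weight relations far too large for one function call, while the UDF version avoids the per-tuple join and aggregation overhead.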
Zou’s proposed solution is to dynamically combine the two approaches by adaptively encapsulating sub-computations involving small-scale parameters into coarse-grained user-defined functions and mapping sub-computations involving large-scale parameters into fine-grained relational algebraic expressions.
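A minimal sketch of that adaptive split, in which the threshold, the operator names and the planner function are all hypothetical stand-ins rather than details of Zou’s design:

```python
# Hypothetical sketch of adaptive lowering: each sub-computation of a model
# is mapped to a plan fragment based on the scale of its parameters.
# The cutoff value below is an assumption for illustration only.
UDF_PARAM_LIMIT = 10_000  # assumed boundary between "small" and "large" scale

def lower_subcomputation(name, num_params):
    """Small-parameter ops become one coarse-grained UDF; large-parameter
    ops are mapped into fine-grained relational algebraic expressions."""
    if num_params <= UDF_PARAM_LIMIT:
        return (name, "user-defined function")
    return (name, "relational algebraic expression")

# Illustrative model with sub-computations of very different scales.
model = [("embedding_lookup", 50_000_000),
         ("dense_layer", 4_096),
         ("output_softmax", 1_000)]

plan = [lower_subcomputation(n, p) for n, p in model]
for name, strategy in plan:
    print(f"{name}: {strategy}")
```

The large embedding lookup is kept in relational form so it can scale with the database engine, while the small dense and softmax layers run as efficient single-operator functions.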
To do this, Zou — who teaches at the School of Computing and Augmented Intelligence, one of the seven Fulton schools — came up with a two-level intermediate representation, or IR, that supports the gradual lowering of machine learning models at any scale into scalable relational algebraic expressions and flexible yet parsable user-defined functions.
“Based on such a two-level IR, the proposed accuracy-aware query optimization and storage optimization techniques introduce the accuracy of model inference as a new dimension in database optimization,” Zou says. “This means that model inference and data queries can be co-optimized in the same system.”
Zou says it is critical and urgent to integrate data management and model serving to enable a broad class of applications that require data-intensive intelligence at interactive speed, such as credit card fraud prediction, personalized customer support, disaster response and real-time recommendations across all types of applications.
“For the past 10 years, I have focused on building data-intensive systems for machine learning, data analytics, and transaction processing in industry and academia,” says Zou.
Zou and her lab have also worked with potential industrial users, such as the IBM Thomas J. Watson Research Center, as well as several academic users.
Going forward, she says, this work will provide new techniques to advance the logical optimization, physical optimization, and storage optimization of end-to-end machine learning inference workflows in a relational database.
“We will also develop and open-source a research prototype of the proposed system,” Zou said. “Furthermore, the project’s research results will enhance and integrate educational activities at the intersection of big data management and machine learning systems.”
As part of the NSF CAREER Award, Zou’s work will also support a Big Data Magic Week activity for underrepresented K-12 students and refugee youth in Arizona. The activity will be used as a platform to prepare selected ASU undergraduate students for international research competitions and will be integrated into an ASU graduate course on data-intensive systems for machine learning.
“ASU provides a collaborative environment for machine learning-related research, through which we can easily identify potential academic users of our research results,” Zou says. “The Fulton Schools of Engineering and (the School of Computing and Augmented Intelligence) provide excellent career development support for junior faculty.”