IPP Symposium

Big Queries, not (only) Big Data

Daniela Florescu, Researcher, Oracle

Technology and fashion have something in common: the media tells people what they should buy at any given time. A flurry of research and development is being targeted today at "big data" management; it's the hot fashion of the day. My talk will argue that this is neither the most interesting data management research problem nor the most challenging. The database community has been parallelizing selects, projects, joins, sorts and group-bys for 40 years, if not longer. We know how to do that.

I will argue that the combination of (a) very complex data processing (which I would call "big queries") on (b) large amounts of data is the interesting and challenging research problem of the day. We do not know how to do that yet, and we need to do it. But what does this mean, and what is the impact? I will discuss the side effects on system architectures, query/processing languages, compilation/optimization, and database architectures.

Dr. Daniela Florescu has a master's degree in Mathematics and a PhD in Computer Science from the University of Paris VI. In her 20 years of experience, she has been a researcher (INRIA, ATT Research, IBM Research, Oracle), an entrepreneur (28msec.com, xqrl.com), and, in between, has worked in both startups and large companies (Oracle, BEA). Last but not least, she was a pioneer of semi-structured data management (when the database world did not want to believe that there was such a thing). This resulted in 15 years of experience authoring the standard query language for XML, XQuery, and the new JSON query language, JSONiq. At Oracle she is in charge of the Zorba open source project, which attempts to put the "big queries" principles into practice.