Databases and Data Mining

Faculty members: Wei-Shinn Ku, David Umphress, Dean Hendrix, and Weikuan Yu

Dr. Wei-Shinn Ku’s research interests include databases, security and privacy, and mobile computing, with an emphasis on spatial and mobile data management, location privacy protection, and location-based services (LBS). Currently, Dr. Ku is working on two projects: RFID data cleansing and database outsourcing.

Radio Frequency Identification (RFID) is an electronic tagging technology that allows objects to be identified automatically at a distance, without a direct line of sight, using an electromagnetic challenge/response exchange. RFID technology supports numerous applications, including inventory management, transportation payments, animal identification, and product tracking. However, practitioners face a challenging problem: the raw data collected by RFID readers is inherently unreliable. The RFID tag reading rate is often only 60% to 70%, so many tags go undetected, and false-positive readings (i.e., tags mistakenly assumed to be present) are possible in real-world environments. False positives give rise to database inconsistency problems; for example, the readings may indicate an item’s presence at two locations at the same time. With incomplete and erroneous raw RFID data, it is very difficult to support high-level applications such as warehouse management efficiently. Currently, Dr. Ku is developing novel solutions that use Markov Chain Monte Carlo methods for RFID data cleansing.
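To make the inconsistency problem concrete, the following minimal Python sketch flags readings that place the same tag at two locations at the same time. It only illustrates the symptom described above and is not Dr. Ku's Markov Chain Monte Carlo approach; the reading format (tag ID, location, timestamp) and the function name find_conflicts are assumptions made for this example.

```python
from collections import defaultdict


def find_conflicts(readings):
    """Return {(tag_id, timestamp): [locations]} for every tag that was
    reported at more than one location at the same timestamp.

    `readings` is an iterable of (tag_id, location, timestamp) tuples;
    this record layout is a hypothetical one chosen for illustration.
    """
    seen = defaultdict(set)
    for tag_id, location, timestamp in readings:
        seen[(tag_id, timestamp)].add(location)
    # Keep only the (tag, time) pairs observed at two or more locations.
    return {key: sorted(locs) for key, locs in seen.items() if len(locs) > 1}


readings = [
    ("tag-1", "dock-A", 100),
    ("tag-1", "shelf-7", 100),  # same tag, same time, different location
    ("tag-2", "dock-A", 100),
    ("tag-1", "shelf-7", 101),
]

print(find_conflicts(readings))
# {('tag-1', 100): ['dock-A', 'shelf-7']}
```

A check like this can only detect that a conflict exists; deciding which of the conflicting readings is the false positive, and inferring the tags that were missed entirely, is the harder cleansing problem.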

Due to rapid advances in network technology, the cost of transmitting a terabyte of data over long distances has decreased significantly in the past five years. Consequently, there is growing interest in outsourcing database management tasks to third parties that can perform these tasks at a much lower cost thanks to economies of scale. Providing query integrity assurance for outsourced databases, that is, assurance that the results returned by the third party are correct and complete, is a new and challenging research topic. The popularity of mobile computing makes the problem especially hard: more and more users access database services from mobile devices with limited resources, and none of the existing query integrity authentication methods can be applied effectively to such resource-limited devices. Developing query integrity auditing techniques that can support mobile applications accessing outsourced databases is therefore an intriguing and important research issue. Dr. Ku aims to develop lightweight query integrity assurance solutions that mobile devices can adopt, significantly increasing the security and usage of mobile applications.
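As a rough illustration of what query integrity verification involves, the sketch below uses a Merkle hash tree, a widely known building block for authenticating data held by an untrusted party. It is offered only as background and is not Dr. Ku's method: the data owner publishes a single root hash, the server returns each record together with a short proof, and the client recomputes the root. All function names and the record encoding here are assumptions for the example.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(records):
    """Build all tree levels, from leaf hashes up to the single root hash."""
    levels = [[h(r) for r in records]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2 == 1:
            cur = cur + [cur[-1]]  # pad odd levels by duplicating the last hash
            levels[-1] = cur
        levels.append([h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def make_proof(levels, index):
    """Collect the sibling hash at every level for the leaf at `index`."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(record, proof, root):
    """Client-side check: recompute the root from one record and its proof."""
    digest = h(record)
    for sibling, sibling_is_left in proof:
        digest = h(sibling + digest) if sibling_is_left else h(digest + sibling)
    return digest == root


records = [b"row-1", b"row-2", b"row-3"]
levels = build_levels(records)       # done once by the data owner
root = levels[-1][0]                 # published to clients
proof = make_proof(levels, 2)        # produced by the server for row-3

print(verify(b"row-3", proof, root))     # True
print(verify(b"tampered", proof, root))  # False
```

The client's work here is a handful of hash computations per record, which hints at why hash-based schemes are attractive for constrained devices; proving that a result set is also complete requires additional machinery that this sketch omits.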

Dr. David Umphress has been working in the field of data mining and predictive analytics, with the ultimate goal of creating a permanent center for such research. He is currently engaged in a project with the Alabama Department of Revenue (ADOR). The project investigates ways in which data mining and predictive analytics can help ADOR identify tax noncompliance and meaningfully improve future adherence to tax laws.

As we enter the information age, we are bombarded with an enormous and rapidly growing volume of data from many computer applications. Such data is of little value until we have a way to search it quickly and identify the key insights it contains. For example, quickly looking up genomic and proteomic sequences with targeted characteristics is a prerequisite for processing the gigantic biological sequences produced by various genome projects. In collaboration with Lawrence Berkeley National Laboratory, Dr. Weikuan Yu and his group are working on a research project that aims to provide cutting-edge data indexing and analytic tools for fast scientific queries and knowledge discovery. Co-op opportunities on this project are also available at Berkeley Lab for undergraduate and graduate students who have demonstrated strong research performance and potential.

Last Updated: Feb 09, 2011