Please use this identifier to cite or link to this item: https://hdl.handle.net/10955/67
Title: Querying Inconsistent Data: Repairs and Consistent Answers
Authors: Parisi, Francesco
Flesca, Sergio
Talia, Domenico
Keywords: Bacini idraulici
Issue Date: 9-Nov-2012
Series/Report no.: SSD ING-INF/05;
Abstract: In this dissertation we provide an extensive survey of the techniques for repairing and querying inconsistent relational databases. We distinguish four parameters for classifying and comparing of the existing techniques. First, we discern two repairing paradigms, namely the tuple-based and the attribute-based repairing paradigm. According to the former paradigm a re- pair for a database is obtained by inserting and=or deleting tuples, whereas according to the latter a repair is obtained by (also) modifying attribute values within tuples. Second, we distinguish several repair semantics which entail di®erent orders among the set of consistent database instances that can be obtained for an inconsistent database with respect to a given set of integrity constraints. Third, we classify the techniques on the basis of the classes of queries considered for computing consistent answers. Finally, we compare the di®erent approaches in literature on basis of the classes of integrity constraints which are assumed to be de¯ned on the database. 2) We investigate the problem of repairing and extracting reliable information from data violating a given set of aggregate constraints. These constraints consist of linear inequalities on aggregate-sum queries issued on measure values stored in the database. This syntactic form enables meaningful con- straints to be expressed. Indeed, aggregate constraints frequently occur in many real-life scenarios where guaranteeing the consistency of numerical data is mandatory. We consider database repairs consisting of sets of value-update opera- tions aiming at re-constructing the correct measure values of inconsistent data. We adopt two di®erent criteria for determining whether a set of update operations repairing data can be considered \reasonable" or not: set-minimal semantics and card-minimal semantics. Both these semantics aim at preserving the information represented in the source data as much as possible. They correspond to di®erent repairing strategies which turn out to be well-suited for di®erent application scenarios. We provide the complexity characterization of three fundamental prob- lems: (i) repairability: is there at least one (possible not minimal) repair for the given database with respect to the speci¯ed constraints? (ii) repair checking: given a set of update operations, is it a minimal repair? (iii) consistent query answer: is a given query true in every minimal repair? 3) We provide a method for computing card-minimal repairs for a database in presence of steady aggregate constraints, a restricted but expressive class of aggregate constraints. Under steady aggregate constraints, an instance of the problem of computing a card-minimal repair can be transformed into an instance of a Mixed-Integer Linear Programming (MILP) problem. Thus, standard techniques and optimizations addressing MILP problems can be re-used for computing a repairs. On the basis of this data-repairing framework, we propose an architecture providing robust data acquisition facilities from input documents contain- ing tabular data. We exploit integrity constraints de¯ned on the input data to support the detection and the repair of inconsistencies in the data arising from errors occurring in the acquisition phase performed on input data.
Description: Tesi di dottorato di ricerca in Ingegneria dei sistemi ed informatica, XIX ciclo
URI: http://hdl.handle.net/10955/67
Appears in Collections:Dipartimento di Ingegneria Informatica, Modellistica, Elettronica e Sistemistica - Tesi di Dottorato

Files in This Item:
File Description SizeFormat 
PhD_Thesis_Parisi_Francesco.pdf1,72 MBAdobe PDFView/Open


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.