The 31st international conference on Legal Knowledge and Information Systems
December 12–14, 2018 in Groningen, The Netherlands
12 December 2018: Workshops & Tutorials
13–14 December 2018: Main conference
The program is also available as a PDF file.
The proceedings are available at our publisher IOS Press.
The lecture will give an overview of deep learning techniques currently used in the processing and generation of natural language text. Interesting use cases are argumentation mining and argumentation generation. Next, we will discuss how deep learning, and more specifically encoder-decoder architectures, can be used in legal decision making, for instance by translating natural language into a formal language used in automated decision processes. Moreover, the inverse process, that is, generating legal language from formal specifications, or rewriting free natural language into clear, explicit and consistent legal language, could be an interesting research avenue.
I will argue that responsible innovations in the field of AI and Law require methods, tools and conceptual frameworks that allow us to use moral values (e.g. regarding transparency and explainability, but also values implied by democracy, human rights and the rule of law) as supra-functional requirements for design. Research conducted in the applied ethics of technology over the past decades can be utilized to this end.
Two years after its entry into force, the EU General Data Protection Regulation became applicable on 25 May 2018. Despite the long time for preparation, privacy policies of online platforms and services still often fail to comply with information duties and the standard of lawfulness of data processing. In this paper we present a new methodology for processing privacy policies under the GDPR's provisions, and a novel annotated corpus, to be used by machine learning systems to automatically check the compliance and adequacy of privacy policies. Preliminary results confirm the potential of the methodology.
Legal contract analysis is an important research area. The classification of clauses or sentences enables valuable insights such as the extraction of rights and obligations. However, datasets consisting of contracts are quite rare, particularly for the German language.
This paper therefore investigates the portability of machine learning (ML) models across document types. We trained different ML classifiers on the tenancy law of the German Civil Code (BGB) and then applied the resulting models to a set of rental agreements. The performance of our models varies on the contract set: some models perform significantly worse, while certain settings yield portable models. Additionally, we trained and evaluated the same classifiers on a dataset consisting solely of contracts in order to obtain a reference performance. We show that the performance of ML models may depend on the document type used for training, while certain setups result in portable models.
This paper is concerned with the task of finding the majority opinion (MO) in UK House of Lords (UKHL) case law by analysing agreement statements (AS) that explicitly express the appointed judges' acceptance of each other's reasoning. We introduce a corpus of 300 UKHL cases in which the relevant AS and MO have been annotated by three legal experts; and we introduce an AI system that automatically identifies these AS and MO with a performance comparable to that of humans.
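Agreement statements in UKHL speeches tend to be highly formulaic, which makes even simple pattern matching a plausible baseline. The sketch below is illustrative only (the patterns and the example speech are invented; the paper's annotated corpus and system are more sophisticated):

```python
import re

# Illustrative pattern for formulaic agreement statements such as
# "I agree with the speech of my noble and learned friend Lord X".
AS_PATTERN = re.compile(
    r"I (?:also )?agree with (?:the (?:speech|opinion|reasons) of )?"
    r"(?:my noble and learned friend[,]? )?(Lord|Lady|Baroness) (\w+)"
)

def find_agreements(opinion: str) -> list:
    """Return the judges explicitly agreed with in an opinion text."""
    return [f"{m.group(1)} {m.group(2)}" for m in AS_PATTERN.finditer(opinion)]

speech = (
    "I have had the advantage of reading in draft the speech of my noble "
    "and learned friend Lord Hoffmann. I agree with the speech of my noble "
    "and learned friend Lord Hoffmann and would dismiss the appeal."
)
print(find_agreements(speech))
```

A regex baseline of this kind misses non-formulaic agreement and partial agreement, which is precisely where expert annotation and learned models earn their keep.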
In this work, we outline an approach for question answering over regulatory documents. In contrast to traditional means of accessing information in the domain, the proposed system attempts to deliver an accurate and precise answer to user queries. This is accomplished by a two-step approach which first selects relevant paragraphs given a question, and then compares the selected paragraph with the user query to predict a span in the paragraph as the answer. We employ neural network based solutions for each step, and compare them with existing and alternative baselines. We perform our evaluations with a gold-standard benchmark comprising over 600 questions on the MaRisk regulatory document. In our experiments, we observe that our proposed system outperforms the other baselines.
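The retrieve-then-read structure of such a pipeline can be sketched with purely lexical stand-ins for the paper's neural components (the paragraph texts, question and scoring functions below are all illustrative assumptions):

```python
import math

def tokens(text):
    return [t.lower().strip(".,;?") for t in text.split()]

def overlap_score(query, paragraph):
    # Step 1 scorer: how many query terms appear in the paragraph.
    return len(set(tokens(query)) & set(tokens(paragraph)))

def select_paragraph(query, paragraphs):
    # Step 1: pick the paragraph most relevant to the question.
    return max(paragraphs, key=lambda p: overlap_score(query, p))

def best_span(query, paragraph, max_len=8):
    # Step 2: pick the contiguous span with the highest length-normalised
    # overlap with the query (a crude stand-in for a neural reader).
    toks = paragraph.split()
    q = set(tokens(query))
    best, best_s = "", -1.0
    for i in range(len(toks)):
        for j in range(i + 1, min(i + max_len, len(toks)) + 1):
            span = toks[i:j]
            s = sum(1 for t in span if t.lower().strip(".,;?") in q) / math.sqrt(j - i)
            if s > best_s:
                best, best_s = " ".join(span), s
    return best

docs = [
    "Institutions must document all risk decisions in writing.",
    "The board reviews the risk strategy at least annually.",
]
q = "How often does the board review the risk strategy?"
para = select_paragraph(q, docs)
print(best_span(q, para))
```

Replacing both scorers with trained neural models, as the paper does, keeps exactly this two-step control flow while changing how relevance and answer spans are estimated.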
In this paper, we propose a structured approach for transforming legal arguments to a Bayesian network (BN) graph. Our approach automatically constructs a fully specified BN graph by exploiting causality information present in legal arguments. Moreover, we demonstrate that causality information in addition provides for constraining some of the probabilities involved. We show that for undercutting attacks it is necessary to distinguish between causal and evidential attacked inferences, which extends on a previously proposed solution to modelling undercutting attacks in BNs. We illustrate our approach by applying it to part of an actual legal case, namely the Sacco and Vanzetti legal case.
Representation and reasoning over legal rules is an important application domain and a number of related approaches have been developed. In this work, we investigate legal reasoning in practice based on three use cases of increasing complexity. We consider three representation and reasoning approaches: (a) Answer Set Programming, (b) Argumentation and (c) Defeasible Logic. Representation and reasoning approaches are evaluated with respect to semantics, expressiveness, efficiency, complexity and support.
GDPR-abiding blockchain systems are feasible, and jurists, programmers, and other experts are increasingly working towards this aim. Still, the manifold blockchain networks already in operation suggest a new generation of data protection issues brought about by this technology. Some of these issues will likely concern the right to erasure set up by Art. 17 of the EU data protection regulation (‘GDPR’). These cases will soon be discussed before national authorities and courts, and will likely test the technical solutions explored in this paper, such as hashing-out methods, key destruction, chameleon hash functions, and more. By taking into account matters of design and the complex architecture of blockchains, we distinguish between blockchains expressly designed to meet the requirements of the EU regulation, and blockchains that, for one reason or another (e.g. because they were designed before the GDPR), trigger some sort of clash with the legal order: (i) a clash on matters of principle, e.g. political decentralization; (ii) a clash with standards on security and data protection; (iii) a mix of the two; or (iv) a social clash. It is still unclear how the interplay of legal regulation, technological constraints, social norms, and market interests will play out in this context. Rulings and court orders will be instructive. It is a clash foretold, after all.
We discuss the lessons learned from implementing a CATO style system using factors with magnitude. In particular we identify that giving factors magnitudes enables a diversity of reasoning styles and arguments. We distinguish a variety of ways in which factors combine to determine abstract factors. We discuss several different roles for values. Finally we identify the additional value related information required to produce a working program: thresholds and weights as well as a simple preference ordering.
We investigate named entity recognition in Greek legislation using state-of-the-art deep neural network architectures. The recognized entities are used to enrich the Greek legislation knowledge graph with more detailed information about persons, organizations, geopolitical entities, legislation references, geographical landmarks and public document references. We also interlink the textual references of the recognized entities with the corresponding entities represented in other open public datasets and, in this way, enable new sophisticated ways of querying Greek legislation. Relying on the results of the aforementioned methods, we generate and publish a new dataset of geographical landmarks mentioned in Greek legislation. We make all datasets and other resources used in our study publicly available. Our work is the first of its kind for the Greek language in such an extended form, and one of the few that examines legal text across the full spectrum of both entity recognition and linking.
This paper introduces PrOnto, the privacy ontology that models the GDPR main conceptual cores: data types and documents, agents and roles, processing purposes, legal bases, processing operations, and deontic operations for modelling rights and duties. The explicit goal of PrOnto is to support legal reasoning and compliance checking by employing defeasible logic theory (i.e., the LegalRuleML standard and the SPINDle engine).
We propose a method that assists legislation officers in finding inappropriate Japanese legal terms in Japanese statutory sentences and suggests corrections. In particular, we focus on sets of similar legal terms whose usages are defined in legislation drafting rules. Our method predicts suitable legal terms in statutory sentences using Random Forest classifiers, each of which is optimized for each set of similar legal terms. Our experiment shows that our method outperformed existing modern word prediction methods using neural language models.
In the last fifteen years, Semantic Web technologies have been successfully applied to the legal domain. By composing all those techniques and theoretical methods, we propose an integrated framework for modelling legal documents and legal knowledge to support legal reasoning, in particular checking compliance. This paper presents a proof-of-concept applied to the GDPR domain, with the aim to detect infringements of privacy compulsory norms or to prevent possible violations using BPMN and Regorous engine.
In common law jurisdictions, legal research often involves an analysis of relevant case law. Court opinions comprise several high-level parts with different functions. A statement's membership in one of the parts is a key factor influencing how the statement should be understood. In this paper we present a number of experiments in automatically segmenting court opinions into the functional and the issue-specific parts. We defined a set of seven types including Background, Analysis, and Conclusions. We used the types to annotate a sizable corpus of US trade secret and cyber crime decisions, and used the resulting data set to investigate the feasibility of recognizing the parts automatically. The proposed framework based on conditional random fields proved to be very promising in this respect. To support research in automatic case law analysis we plan to release the data set to the public.
Smart contracts have been proposed as executable implementations enforcing real-life contracts. Unfortunately, the semantic gap between these allows for the smart contract to diverge from its intended deontic behaviour. In this paper we show how a deontic contract can be used for real-time monitoring of smart contracts specifically and request-based interactive systems in general, allowing for the identification of any violations. The deontic logic of actions we present takes into account the possibility of action failure (which we can observe in smart contracts), allowing us to consider novel monitorable semantics for deontic norms. For example, taking a rights-based view of permissions allows us to detect the violation of a permission when a permitted action is not allowed to succeed. A case study is presented showing this approach in action for Ethereum smart contracts.
The paper presents three experimental platforms for legal analytics: online environments integrating heterogeneous computational heuristics, information processing, and visualization techniques to extract actionable knowledge from legal data. Our goal is to explore innovative approaches to issues spanning from information retrieval to the quantitative analysis of legal corpora and the study of criminal organizations for research and investigative purposes. After a brief introduction to the e-science paradigm and the role played in it by research platforms, we focus on visual analytics as a viable way to interact with legal data. We then present the tools, their main features and the results obtained so far. The paper ends with some considerations about the computational turn of science and its role in promoting a much needed interdisciplinary and empirical evolution of legal research.
In this paper we report on the experience gathered in producing two gold-standard alignment datasets between the European Union thesaurus EuroVoc and two other notable resources adopted in legal environments: the thesaurus of the Italian Senate, TESEO, and the European terminological resource IATE. The two resources were realized in the context of the PMKI project, a European Commission action aiming to create a Public Multilingual Knowledge management Infrastructure to support e-commerce solutions in a multilingual environment. Given the numerous lexical and terminological resources involved in this project, ontology and thesaurus alignment and, as a consequence, the evaluation of automatically generated alignments play a pivotal role in the project's success.
In this paper an automated solution for finding cases for analysing the impact of legal change is proposed, and the results are analysed with the help of a legal expert. It focuses on the automatic classification of 15,000 judgements within civil law. We investigated to what extent several machine learning algorithms were able to classify cases ‘correctly’, achieving accuracies around 0.85. However, the data were scarce and the initial labelling imperfect, so further research should focus on these aspects to improve the analysis of the impact of legal change.
This work investigates legal concepts and their expression in Portuguese, concentrating on the “Order of Attorneys of Brazil” Bar exam. Using a corpus formed by a collection of multiple-choice questions, three norms related to the Ethics part of the OAB exam, language resources (Princeton WordNet and OpenWordNet-PT) and tools (AntConc and Freeling), we began to investigate the concepts and words missing from our repertory of concepts and words in Portuguese, the knowledge base OpenWordNet-PT. We add these concepts and words to OpenWordNet-PT and hence obtain a representation of these texts that is mostly “contained” in the lexical knowledge base.
Legal standards for suspicion involve seemingly limitless possible factors, leaving them vague and subject to concerns of illegitimate biases by decision makers. Beginning with the relatively small number of factors present in drug interdiction stops, a model can be developed that not only predicts judicial behavior but the odds of discovering drugs. This technology will require legislatures or judges to begin the process of determining what numerical threshold of suspicion justifies investigatory detentions and searches.
We demonstrate CISpaces.org, a tool to support situational understanding in intelligence analysis that complements but does not replace human expertise, applied for the first time to a judicial context. The system combines argumentation-based reasoning and natural language generation to support the creation of analysis and summary reports, and to record the process of forming hypotheses from relationships among information.
This paper presents a novel method to address legal rights for children through a chatbot framework by integrating machine learning, a dialogue graph, and information extraction. The method addresses a significant problem: we cannot presume that children have common knowledge about their rights or express themselves as an adult might. In our framework, a chatbot user begins a conversation, where based on the circumstance described, a neural network predicts both speech acts, relating to a dialogue graph, and legal types. Information is extracted in order to create a case for a legal advisor. In collaboration with the Children's Legal Centre Wales, who advocate for the improvement of legal rights in Wales, a corpus has been constructed and a prototype chatbot developed. The framework has been evaluated with classification measures and a user study.
Faced with a growing number of cases, Chinese courts have gradually developed a trial mode that improves the efficiency of trials by organizing them around the controversial issues. However, identifying the controversial issues in specific cases is affected not only by the uncertainty of facts and laws, but also by the discretion of judges and extra-case factors, and cannot be expressed in a standard format, which makes case retrieval based on controversial issues a challenging problem. In this paper, we propose a controversial-issue merging algorithm based on K-means clustering for Chinese legal texts. The proposed algorithm automatically determines the number of clusters for a given cause of action and merges the controversial issues semantically, which makes case information retrieval more accurate and effective.
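The K-means merging step can be illustrated with a minimal bag-of-words sketch. Everything below is an assumption for illustration: the issue texts are invented, the initialization is simplified, and the number of clusters is fixed in advance, whereas the paper's algorithm determines it automatically:

```python
from collections import Counter

def bow(text, vocab):
    # Bag-of-words vector over a shared vocabulary.
    c = Counter(text.split())
    return [c[w] for w in vocab]

def dist(a, b):
    # Squared Euclidean distance.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=10):
    # Deterministic init for this sketch: first k points as centers
    # (real systems use k-means++ or random restarts).
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy controversial-issue texts (illustrative, not from the paper's corpus).
issues = [
    "whether the rent increase was lawful",
    "liability for the traffic accident",
    "validity of the rent increase notice",
    "damages caused by the traffic accident",
]
vocab = sorted({w for s in issues for w in s.split()})
points = [bow(s, vocab) for s in issues]
labels = kmeans(points, 2)
print(labels)
```

In a realistic setting the bag-of-words vectors would be replaced by semantic embeddings of Chinese legal text, so that paraphrased issues land in the same cluster.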
One puzzle studied in AI & Law is how arguments, rules and cases are formally connected. Recently a formal theory was proposed formalizing how the validity of arguments based on rules can be grounded in cases. Three kinds of argument validity were distinguished: coherence, presumptive validity and conclusiveness. In this paper the theory is implemented in a Prolog program, used to evaluate a previously developed model of Dutch tort law. We also test the theory and its implementation with a new case study modelling Chinese copyright infringement law. In this way we illustrate that by the use of the implementation the process of modelling becomes more efficient and less error-prone.
This paper describes a new method to extract relevant keywords from patent claims, as part of the task of retrieving other patents with similar claims (search for prior art). The method combines a qualitative analysis of the writing style of the claims with NLP methods for parsing text, in order to represent a legal text as a specialization arborescence of terms. In this setting, the set of extracted keywords yields better search results than keywords extracted with traditional methods such as TF-IDF.
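The TF-IDF baseline the method is compared against can be reproduced in a few lines (the toy claim texts below are illustrative, not taken from any patent corpus):

```python
import math
from collections import Counter

def tfidf_keywords(doc, corpus, top_n=3):
    # tf: term frequency in the target claim; idf: log(N / df) over the corpus.
    words = doc.lower().split()
    tf = Counter(words)
    n = len(corpus)
    def idf(w):
        df = sum(1 for d in corpus if w in d.lower().split())
        return math.log(n / df) if df else 0.0
    scored = {w: tf[w] * idf(w) for w in tf}
    return [w for w, _ in sorted(scored.items(), key=lambda x: -x[1])[:top_n]]

claims = [
    "a battery comprising a lithium anode and a polymer electrolyte",
    "a battery comprising a zinc anode and a liquid electrolyte",
    "a method for charging a battery using solar panels",
]
print(tfidf_keywords(claims[0], claims))
```

TF-IDF downweights terms that appear in every claim ("a", "battery") and promotes the discriminative ones ("lithium", "polymer"); the paper's contribution is that exploiting the specialization structure of claim language improves on this purely frequency-based ranking.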
In this work we enrich a formalism for argumentation by including a formal characterization of features related to the knowledge, in order to capture proper reasoning in legal domains. We add meta-data information to the arguments in the form of labels representing quantitative and qualitative data about them. These labels are propagated through an argumentative graph according to the relations of support, conflict, and aggregation between arguments.
The decision whether to accept or reject a new case is a well established task undertaken in legal work. This task frequently necessitates domain knowledge and is consequently resource expensive. In this paper it is proposed that early rejection/acceptance of at least a proportion of new cases can be effectively achieved without requiring significant human intervention. The paper proposes, and evaluates, five different AI techniques whereby early case reject-accept can be achieved. The results suggest it is possible for at least a proportion of cases to be processed in this way.
The growing amount of textual data in the legal domain leads to a demand for better text analysis tools adapted to legal domain specific use cases. Semantic Text Matching (STM) is the general problem of linking text fragments of one or more document types. The STM problem is present in many legal document analysis tasks, such as argumentation mining. A common solution approach to the STM problem is to use text similarity measures to identify matching text fragments. In this paper, we recapitulate the STM problem and a use case in German tenancy law, where we match tenancy contract clauses and legal comment chapters. We propose an approach similar to local interpretable model-agnostic explanations (LIME) to better understand the behavior of text similarity measures like TFIDF and word embeddings. We call this approach eXplainable Semantic Text Matching (XSTM).
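The perturbation idea behind a LIME-style explanation of a similarity measure can be sketched as follows: remove one word at a time from a clause and record how much its similarity to a comment chapter drops. The clause, comment, and use of plain bag-of-words cosine are illustrative assumptions, not the paper's XSTM implementation:

```python
import math
from collections import Counter

def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def explain_similarity(text_a, text_b):
    # LIME-style attribution: drop each word of text_a in turn and
    # measure how much the similarity to text_b changes.
    b = Counter(text_b.split())
    base = cosine(Counter(text_a.split()), b)
    scores = {}
    for w in set(text_a.split()):
        perturbed = Counter(t for t in text_a.split() if t != w)
        scores[w] = base - cosine(perturbed, b)
    return sorted(scores.items(), key=lambda x: -x[1])

clause = "the landlord may terminate the tenancy with three months notice"
comment = "termination of a tenancy requires adequate notice by the landlord"
print(explain_similarity(clause, comment))
```

Words whose removal lowers the similarity carry a positive attribution; removing a word the measure ignores can even raise the cosine. Swapping the cosine for TF-IDF or word-embedding similarity, as the paper does, leaves the attribution loop unchanged.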
Governments across the world are testing different uses of the blockchain for the delivery of their public services. Blockchain hashing, i.e. the insertion of data in the blockchain, is one of the potential applications of the blockchain in this space. With this method, users can apply special scripts to add their data to blockchain transactions, ensuring both immutability and publicity. Blockchain hashing also secures the integrity of the original data stored on central governmental databases. The paper starts by analysing possible scenarios of hashing on the blockchain and assesses in which cases it may work and in which it is less likely to add value to a public administration. Second, the paper compares this method with traditional digital signatures using PKI (Public Key Infrastructure) and discusses standardisation in each domain. Third, it addresses issues related to concepts such as “distributed ledger technology” and “permissioned blockchains.” Finally, it raises the question of whether blockchain hashing is an effective solution for electronic governance, and concludes that its value is controversial, even if it is improved by PKI and other security measures. In this regard, we claim that governments need first to identify pain points in governance, and then consider the trade-offs of the blockchain as a potential solution versus other alternatives.
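The integrity guarantee behind blockchain hashing rests on an ordinary cryptographic hash: the chain anchors only the digest, and any later change to the off-chain record is detectable. A minimal sketch with SHA-256 (the record format and the idea of an "anchored" digest are illustrative assumptions):

```python
import hashlib

def digest(record: bytes) -> str:
    # SHA-256 digest of an off-chain record.
    return hashlib.sha256(record).hexdigest()

# A public registry anchors only the digest on-chain, not the record itself.
record = b"land-parcel:42;owner:Alice;date:2018-12-13"
anchored = digest(record)

# Later, anyone holding the original record can verify it against the chain.
assert digest(record) == anchored

# Any tampering with the off-chain record changes the digest.
tampered = b"land-parcel:42;owner:Mallory;date:2018-12-13"
assert digest(tampered) != anchored
```

Note that this proves only integrity of the stored record: it says nothing about who submitted the hash or whether the record was true in the first place, which is where the paper's comparison with PKI-based signatures becomes relevant.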