[{"content": "Reconstructing Neural Parameters and Synapses of arbitrary interconnected Neurons from their Simulated Spiking Activity\nTo understand the behavior of a neural circuit it is a presupposition that we have a model of the dynamical system describing this circuit. This model is determined by several parameters, including not only the synaptic weights, but also the parameters of each neuron. Existing works mainly concentrate on either the synaptic weights or the neural parameters. In this paper we present an algorithm to reconstruct all parameters including the synaptic weights of a spiking neuron model. The model based on works of Eugene M. Izhikevich [1] consists of two differential equations and covers different types of cortical neurons. It combines the dynamical properties of Hodgkin-Huxley-type dynamics with a high computational efficiency. The presented algorithm uses the recordings of the corresponding membrane potentials of the model for the reconstruction and consists of two main components. The first component is a rank based Genetic Algorithm (GA) which is used to find the neural parameters of the model. The second one is a Least Mean Squares approach which computes the synaptic weights of all interconnected neurons by minimizing the squared error between the calculated and the measured membrane potentials for each timestep. In preparation for the reconstruction of the neural parameters and of the synaptic weights from real measured membrane potentials, promising results based on simulated data generated with a randomly parametrized Izhikevich model are presented. The reconstruction does not only converge to a global minimum of neural parameters, but also approximates the synaptic weights with high precision.", "file_path": "./data/paper/Fischer/1608.06132.pdf", "title": "Reconstructing Neural Parameters and Synapses of arbitrary interconnected Neurons from their Simulated Spiking Activity", "abstract": "To understand the behavior of a neural circuit it is a presupposition that we have a model of the dynamical system describing this circuit. This model is determined by several parameters, including not only the synaptic weights, but also the parameters of each neuron. Existing works mainly concentrate on either the synaptic weights or the neural parameters. In this paper we present an algorithm to reconstruct all parameters including the synaptic weights of a spiking neuron model. The model based on works of Eugene M. Izhikevich [1] consists of two differential equations and covers different types of cortical neurons. It combines the dynamical properties of Hodgkin-Huxley-type dynamics with a high computational efficiency. The presented algorithm uses the recordings of the corresponding membrane potentials of the model for the reconstruction and consists of two main components. The first component is a rank based Genetic Algorithm (GA) which is used to find the neural parameters of the model. The second one is a Least Mean Squares approach which computes the synaptic weights of all interconnected neurons by minimizing the squared error between the calculated and the measured membrane potentials for each timestep. In preparation for the reconstruction of the neural parameters and of the synaptic weights from real measured membrane potentials, promising results based on simulated data generated with a randomly parametrized Izhikevich model are presented. The reconstruction does not only converge to a global minimum of neural parameters, but also approximates the synaptic weights with high precision.", "keywords": ["spiking neuron model", "Izhikevich model reconstruction", "synaptic weight estimation", "Genetic Algorithm", "Least Mean Squares", "parameter estimation"], "author": "Fischer"}, {"content": "About Learning in Recurrent Bistable Gradient Networks\nRecurrent Bistable Gradient Networks [1], [2], [3] are attractor based neural networks characterized by bistable dynamics of each single neuron. Coupled together using linear interaction determined by the interconnection weights, these networks do not suffer from spurious states or very limited capacity anymore. Vladimir Chinarov and Michael Menzinger, who invented these networks, trained them using Hebb's learning rule. We show, that this way of computing the weights leads to unwanted behaviour and limitations of the networks capabilities. Furthermore we evince, that using the first order of Hintons Contrastive Divergence CD 1 algorithm [4] leads to a quite promising recurrent neural network. These findings are tested by learning images of the MNIST database for handwritten numbers.", "file_path": "./data/paper/Fischer/1608.08265.pdf", "title": "About Learning in Recurrent Bistable Gradient Networks", "abstract": "Recurrent Bistable Gradient Networks [1], [2], [3] are attractor based neural networks characterized by bistable dynamics of each single neuron. Coupled together using linear interaction determined by the interconnection weights, these networks do not suffer from spurious states or very limited capacity anymore. Vladimir Chinarov and Michael Menzinger, who invented these networks, trained them using Hebb's learning rule. We show, that this way of computing the weights leads to unwanted behaviour and limitations of the networks capabilities. Furthermore we evince, that using the first order of Hintons Contrastive Divergence CD 1 algorithm [4] leads to a quite promising recurrent neural network. These findings are tested by learning images of the MNIST database for handwritten numbers.", "keywords": [], "author": "Fischer"}, {"content": "Supporting Agile Reuse Through Extreme Harvesting\nAgile development and software reuse are both recognized as effective ways of improving time to market and quality in software engineering. However, they have traditionally been viewed as mutually exclusive technologies which are difficult if not impossible to use together. In this paper we show that, far from being incompatible, agile development and software reuse can be made to work together and, in fact, complement each other. The key is to tightly integrate reuse into the test-driven development cycles of agile methods and to use test cases-the agile measure of semantic acceptability-to influence the component search process. In this paper we discuss the issues involved in doing this in association with Extreme Programming, the most widely known agile development method, and Extreme Harvesting, a prototype technique for the test-driven harvesting of components from the Web. When combined in the appropriate way we believe they provide a good foundation for the fledgling concept of agile reuse.", "file_path": "./data/paper/Hummel/Support Agile.pdf", "title": "Supporting Agile Reuse Through Extreme Harvesting", "abstract": "Agile development and software reuse are both recognized as effective ways of improving time to market and quality in software engineering. However, they have traditionally been viewed as mutually exclusive technologies which are difficult if not impossible to use together. In this paper we show that, far from being incompatible, agile development and software reuse can be made to work together and, in fact, complement each other. The key is to tightly integrate reuse into the test-driven development cycles of agile methods and to use test cases-the agile measure of semantic acceptability-to influence the component search process. In this paper we discuss the issues involved in doing this in association with Extreme Programming, the most widely known agile development method, and Extreme Harvesting, a prototype technique for the test-driven harvesting of components from the Web. When combined in the appropriate way we believe they provide a good foundation for the fledgling concept of agile reuse.", "keywords": [], "author": "Hummel"}, {"content": "Iterative and Incremental Development of Component-Based Software Architectures\nWhile the notion of components has had a major positive impact on the way software architectures are conceptualized and represented, they have had relatively little impact on the processes and procedures used to develop software systems. In terms of software development processes, use case-driven iterative and incremental development has become the predominant paradigm, which at best ignores components and at worse is even antagonistic to them. However, use-case driven, I&I development (as popularized by agile methods) and component-based development have opposite strengths and weaknesses. The former's techniques for risk mitigation and prioritization greatly reduce the risks associated with software engineering, but often give rise to suboptimal architectures that emerge in a semi-ad hoc fashion over time. In contrast, the latter gives rise to robust, optimized architectures, but to date has poor process support. In principle, therefore, there is a lot to be gained by fundamentally aligning the core principles of component-based and I&I development into a single, unified development approach. In this position paper we discuss the key issues involved in attaining such a synergy and suggest some core ideas for merging the principles of componentbased and I&I development.", "file_path": "./data/paper/Hummel/2304736.2304750.pdf", "title": "Iterative and Incremental Development of Component-Based Software Architectures", "abstract": "While the notion of components has had a major positive impact on the way software architectures are conceptualized and represented, they have had relatively little impact on the processes and procedures used to develop software systems. In terms of software development processes, use case-driven iterative and incremental development has become the predominant paradigm, which at best ignores components and at worse is even antagonistic to them. However, use-case driven, I&I development (as popularized by agile methods) and component-based development have opposite strengths and weaknesses. The former's techniques for risk mitigation and prioritization greatly reduce the risks associated with software engineering, but often give rise to suboptimal architectures that emerge in a semi-ad hoc fashion over time. In contrast, the latter gives rise to robust, optimized architectures, but to date has poor process support. In principle, therefore, there is a lot to be gained by fundamentally aligning the core principles of component-based and I&I development into a single, unified development approach. In this position paper we discuss the key issues involved in attaining such a synergy and suggest some core ideas for merging the principles of componentbased and I&I development.", "keywords": ["D", "2", "11 [Software]: Software Architectures Component-based software architectures, iterative and incremental software development"], "author": "Hummel"}, {"content": "Using Cultural Metadata for Artist Recommendations\nOur approach to generate recommendations for similar artists follows a recent tradition of authors tackling the problem not with content-based audio analysis. Following this novel procedure we rely on the acquisition, filtering and condensing of unstructured text-based information that can be found in the web. The beauty of this approach lies in the possibility to access so-called cultural metadata that is the agglomeration of several independentoriginally subjective-perspectives about music.", "file_path": "./data/paper/Hummel/Using_cultural_metadata_for_artist_recommendations.pdf", "title": "Using Cultural Metadata for Artist Recommendations", "abstract": "Our approach to generate recommendations for similar artists follows a recent tradition of authors tackling the problem not with content-based audio analysis. Following this novel procedure we rely on the acquisition, filtering and condensing of unstructured text-based information that can be found in the web. The beauty of this approach lies in the possibility to access so-called cultural metadata that is the agglomeration of several independentoriginally subjective-perspectives about music.", "keywords": [], "author": "Hummel"}, {"content": "A Collection of Software Engineering Challenges for Big Data System Development\nIn recent years, the development of systems for processing and analyzing large amounts of data (so-called Big Data) has become an important sub-discipline of software engineering. However, to date there exits no comprehensive summary of the specific idiosyncrasies and challenges that the development of Big Data systems imposes on software engineers. With this paper, we aim to provide a first step towards filling this gap based on our collective experience from industry and academic projects as well as from consulting and initial literature reviews. The main contribution of our work is a concise summary of 26 challenges in engineering Big Data systems, collected and consolidated by means of a systematic identification process. The aim is to make practitioners more aware of common challenges and to offer researchers a solid baseline for identifying novel software engineering research directions.", "file_path": "./data/paper/Hummel/GI_AK_BDA_Challenges-preprint.pdf", "title": "A Collection of Software Engineering Challenges for Big Data System Development", "abstract": "In recent years, the development of systems for processing and analyzing large amounts of data (so-called Big Data) has become an important sub-discipline of software engineering. However, to date there exits no comprehensive summary of the specific idiosyncrasies and challenges that the development of Big Data systems imposes on software engineers. With this paper, we aim to provide a first step towards filling this gap based on our collective experience from industry and academic projects as well as from consulting and initial literature reviews. The main contribution of our work is a concise summary of 26 challenges in engineering Big Data systems, collected and consolidated by means of a systematic identification process. The aim is to make practitioners more aware of common challenges and to offer researchers a solid baseline for identifying novel software engineering research directions.", "keywords": [], "author": "Hummel"}, {"content": "Structuring Software Reusability Metrics for Component-Based Software Development\nThe idea of reusing software components has been present in software engineering for several decades. Although the software industry developed massively in recent decades, component reuse is still facing numerous challenges and lacking adoption by practitioners. One of the impediments preventing efficient and effective reuse is the difficulty to determine which artifacts are best suited to solve a particular problem in a given context and how easy it will be to reuse them there. So far, no clear framework is describing the reusability of software and structuring appropriate metrics that can be found in literature. Nevertheless, a good understanding of reusability as well as adequate and easy to use metrics for quantification of reusability are necessary to simplify and accelerate the adoption of component reuse in software development. Thus, we propose an initial version of such a framework intended to structure existing reusability metrics for component-based software development that we have collected for this paper.", "file_path": "./data/paper/Hummel/Structuring_Software_Reusability_Metrics.pdf", "title": "Structuring Software Reusability Metrics for Component-Based Software Development", "abstract": "The idea of reusing software components has been present in software engineering for several decades. Although the software industry developed massively in recent decades, component reuse is still facing numerous challenges and lacking adoption by practitioners. One of the impediments preventing efficient and effective reuse is the difficulty to determine which artifacts are best suited to solve a particular problem in a given context and how easy it will be to reuse them there. So far, no clear framework is describing the reusability of software and structuring appropriate metrics that can be found in literature. Nevertheless, a good understanding of reusability as well as adequate and easy to use metrics for quantification of reusability are necessary to simplify and accelerate the adoption of component reuse in software development. Thus, we propose an initial version of such a framework intended to structure existing reusability metrics for component-based software development that we have collected for this paper.", "keywords": ["Software Reusability", "Software Reusability Metrics", "Component-Based Software Development. I"], "author": "Hummel"}, {"content": "A Practical Approach to Web Service Discovery and Retrieval\nOne of the fundamental pillars of the web service vision is a brokerage system that enables services to be published to a searchable repository and later retrieved by potential users. This is the basic motivation for the UDDI standard, one of the three standards underpinning current web service technology. However, this aspect of the technology has been the least successful, and the few web sites that today attempt to provide a web service brokerage facility do so using a simple cataloguing approach rather than UDDI. In this paper we analyze why the brokerage aspect of the web service vision has proven so difficult to realize in practice and outline the technical difficulties involved in setting up and maintaining useful repositories of web services. We then describe a pragmatic approach to web service brokerage based on automated indexing and discuss the required technological foundations. We also suggest some ideas for improving the existing standards to better support this approach and web service searching in general.", "file_path": "./data/paper/Hummel/04279605.pdf", "title": "A Practical Approach to Web Service Discovery and Retrieval", "abstract": "One of the fundamental pillars of the web service vision is a brokerage system that enables services to be published to a searchable repository and later retrieved by potential users. This is the basic motivation for the UDDI standard, one of the three standards underpinning current web service technology. However, this aspect of the technology has been the least successful, and the few web sites that today attempt to provide a web service brokerage facility do so using a simple cataloguing approach rather than UDDI. In this paper we analyze why the brokerage aspect of the web service vision has proven so difficult to realize in practice and outline the technical difficulties involved in setting up and maintaining useful repositories of web services. We then describe a pragmatic approach to web service brokerage based on automated indexing and discuss the required technological foundations. We also suggest some ideas for improving the existing standards to better support this approach and web service searching in general.", "keywords": [], "author": "Hummel"}, {"content": "Strategies for the Run-Time Testing of Third Party Web Services\nBecause of the dynamic way in which serviceoriented architectures are configured, the correct interaction of service users and service providers can only be fully tested at r un-time. However, the run-time testing of web services is complicated by the fact that they may be arbitrarily shared and may have lifetimes which are independent of the applications that use them. In this paper we investigate this situation by first identifying the different types of tests that can be applied to services at run-time and the different types of web services that can be used in service-oriented systems. We then discuss how these can be combinedidentifying the combinations of tests and web services that make sense and those that do not. The resulting analysis identifies six distinct forms of run-time testing strategy of practical value in service-oriented systems.", "file_path": "./data/paper/Hummel/Strategies_for_the_Run-Time_Testing_of_Third_Party_Web_Services.pdf", "title": "Strategies for the Run-Time Testing of Third Party Web Services", "abstract": "Because of the dynamic way in which serviceoriented architectures are configured, the correct interaction of service users and service providers can only be fully tested at r un-time. However, the run-time testing of web services is complicated by the fact that they may be arbitrarily shared and may have lifetimes which are independent of the applications that use them. In this paper we investigate this situation by first identifying the different types of tests that can be applied to services at run-time and the different types of web services that can be used in service-oriented systems. We then discuss how these can be combinedidentifying the combinations of tests and web services that make sense and those that do not. The resulting analysis identifies six distinct forms of run-time testing strategy of practical value in service-oriented systems.", "keywords": [], "author": "Hummel"}, {"content": "Evaluating the Efficiency of Retrieval Methods for Component Repositories\nComponent-based software reuse has long been seen as a means of improving the efficiency of software development projects and the resulting quality of software systems. However, in practice it has proven difficult to set up and maintain viable software repositories and provide effective mechanisms for retrieving components and services from them. Although the literature contains a comprehensive collection of retrieval methods, to date there have been few evaluations of their relative efficiency. Moreover, those that are available only study small repositories of about a few hundred components. Since today's internet-based repositories are many orders of magnitude larger they require much higher search precision to deliver usable results. In this paper we present an evaluation of well known component retrieval techniques in the context of modern component repositories available on the World Wide Web.", "file_path": "./data/paper/Hummel/Evaluation the Efficeniency of Retrieval Methods.pdf", "title": "Evaluating the Efficiency of Retrieval Methods for Component Repositories", "abstract": "Component-based software reuse has long been seen as a means of improving the efficiency of software development projects and the resulting quality of software systems. However, in practice it has proven difficult to set up and maintain viable software repositories and provide effective mechanisms for retrieving components and services from them. Although the literature contains a comprehensive collection of retrieval methods, to date there have been few evaluations of their relative efficiency. Moreover, those that are available only study small repositories of about a few hundred components. Since today's internet-based repositories are many orders of magnitude larger they require much higher search precision to deliver usable results. In this paper we present an evaluation of well known component retrieval techniques in the context of modern component repositories available on the World Wide Web.", "keywords": [], "author": "Hummel"}, {"content": "Using the Web as a Reuse Repository\nSoftware reuse is widely recognized as an effective way of increasing the quality of software systems whilst lowering the effort and time involved in their development. Although most of the basic techniques for software retrieval have been around for a while, third party reuse is still largely a \"hit and miss\" affair and the promise of large case component marketplaces has so far failed to materialize. One of the key obstacles to systematic reuse has traditionally been the set up and maintenance of up-to-date software repositories. However, the rise of the World Wide Web as a general information repository holds the potential to solve this problem and give rise to a truly ubiquitous library of (open source) software components. This paper surveys reuse repositories on the Web and estimates the amount of software currently available in them. We also briefly discuss how this software can be harvested by means of general purpose web search engines and demonstrate the effectiveness of our implementation of this approach by applying it to reuse examples presented in earlier literature.", "file_path": "./data/paper/Hummel/LNCS 4039 - Reuse of Off-the-Shelf Components.pdf", "title": "Using the Web as a Reuse Repository", "abstract": "Software reuse is widely recognized as an effective way of increasing the quality of software systems whilst lowering the effort and time involved in their development. Although most of the basic techniques for software retrieval have been around for a while, third party reuse is still largely a \"hit and miss\" affair and the promise of large case component marketplaces has so far failed to materialize. One of the key obstacles to systematic reuse has traditionally been the set up and maintenance of up-to-date software repositories. However, the rise of the World Wide Web as a general information repository holds the potential to solve this problem and give rise to a truly ubiquitous library of (open source) software components. This paper surveys reuse repositories on the Web and estimates the amount of software currently available in them. We also briefly discuss how this software can be harvested by means of general purpose web search engines and demonstrate the effectiveness of our implementation of this approach by applying it to reuse examples presented in earlier literature.", "keywords": [], "author": "Hummel"}, {"content": "Extreme Harvesting: Test Driven Discovery and Reuse of Software Components\nThe reuse of software components is the key to improving productiviw and quality levels in sofhvare engineering. However, although the technologies for plugging together components have evolved dramaticaltv over the last few years (e.g. EJB, .NET Web Services) the technologies for actually $riding them in the first place are still relatively immature. In this paper we present a simple but eflective approach for harvesting software components from the Internet. The initial discovery of components is achieved using a standard web search engine such as Google, and the evaluation of '@ness for purpose\" is performed by automated testing. Since test-driven evaluation of sofware is the hallmark of Extreme Programming, and the appvoach naturalb complements the extreme approach to software engineering, we refer to it as \"Extreme Harvesting\". 711e paper first explains the principles behind Extreme Harvesting and then describes a prototype implementation.", "file_path": "./data/paper/Hummel/Extreme_Harvesting_test_driven_discovery_and_reuse_of_software_components.pdf", "title": "Extreme Harvesting: Test Driven Discovery and Reuse of Software Components", "abstract": "The reuse of software components is the key to improving productiviw and quality levels in sofhvare engineering. However, although the technologies for plugging together components have evolved dramaticaltv over the last few years (e.g. EJB, .NET Web Services) the technologies for actually $riding them in the first place are still relatively immature. In this paper we present a simple but eflective approach for harvesting software components from the Internet. The initial discovery of components is achieved using a standard web search engine such as Google, and the evaluation of '@ness for purpose\" is performed by automated testing. Since test-driven evaluation of sofware is the hallmark of Extreme Programming, and the appvoach naturalb complements the extreme approach to software engineering, we refer to it as \"Extreme Harvesting\". 711e paper first explains the principles behind Extreme Harvesting and then describes a prototype implementation.", "keywords": [], "author": "Hummel"}, {"content": "More Archetypal Usage Scenarios for Software Search Engines\nThe increasing availability of software in all kinds of repositories has renewed interest in software retrieval and software reuse. Not only has there been significant progress in developing various types of tools for searching for reusable artifacts, but also the integration of these tools into development environments has matured considerably. Yet, relatively little is known on why and how developers use these features and whether there are applications of the technology that go beyond classic reuse. Since we believe it is important for our fledgling community to understand how developers can benefit from software search systems, we present an initial collection of archetypal usage scenarios for them. These are derived from a survey of existing literature along with novel ideas from ongoing experiments with a state of the art software search engine.", "file_path": "./data/paper/Hummel/1809175.1809181.pdf", "title": "More Archetypal Usage Scenarios for Software Search Engines", "abstract": "The increasing availability of software in all kinds of repositories has renewed interest in software retrieval and software reuse. Not only has there been significant progress in developing various types of tools for searching for reusable artifacts, but also the integration of these tools into development environments has matured considerably. Yet, relatively little is known on why and how developers use these features and whether there are applications of the technology that go beyond classic reuse. Since we believe it is important for our fledgling community to understand how developers can benefit from software search systems, we present an initial collection of archetypal usage scenarios for them. These are derived from a survey of existing literature along with novel ideas from ongoing experiments with a state of the art software search engine.", "keywords": ["D.2.13 [Software]: Reusable Software-Reuse Models Software Search Engines", "Life Cycle", "Reuse", "Testing Software", "Search Engines", "Test-Driven Reuse", "Discrepancy-Driven Testing"], "author": "Hummel"}, {"content": "An Unabridged Source Code Dataset for Research in Software Reuse\nThis paper describes a large, unabridged data-set of Java source code gathered and shared as part of the Merobase Component Finder project of the Software-Engineering Group at the University of Mannheim. It consists of the complete index used to drive the search engine, www.merobase.com, the vast majority 1 of the source code modules accessible through it, and a tool that enables researchers to efficiently browse the collected data. We describe the techniques used to collect, format and store the data set, as well as the core capabilities of the Merobase search engine such as classic keyword-based, interface-based and test-driven search. This data-set, which represents", "file_path": "./data/paper/Hummel/msr13-id43-p-15930-preprint.pdf", "title": "An Unabridged Source Code Dataset for Research in Software Reuse", "abstract": "This paper describes a large, unabridged data-set of Java source code gathered and shared as part of the Merobase Component Finder project of the Software-Engineering Group at the University of Mannheim. It consists of the complete index used to drive the search engine, www.merobase.com, the vast majority 1 of the source code modules accessible through it, and a tool that enables researchers to efficiently browse the collected data. We describe the techniques used to collect, format and store the data set, as well as the core capabilities of the Merobase search engine such as classic keyword-based, interface-based and test-driven search. This data-set, which represents", "keywords": [], "author": "Hummel"}, {"content": "Reuse-Oriented Code Recommendation Systems\nEffective software reuse has long been regarded as an important foundation for a more engineering-like approach to software development. Proactive recommendation systems that have the ability to unobtrusively suggest immediately applicable reuse opportunities can become a crucial step toward realizing this goal and making reuse more practical. This chapter focuses on tools that support reuse through the recommendation of source code-reuse-oriented code recommendation systems (ROCR). These support a large variety of common code reuse approaches from the copy-and-paste metaphor to other techniques such as automatically generating code using the knowledge gained by mining source code repositories. In this chapter, we discuss the foundations of software search and reuse, provide an overview of the main characteristics of ROCR systems, and describe how they can be built.", "file_path": "./data/paper/Hummel/978-3-642-45135-5 (1).pdf", "title": "Reuse-Oriented Code Recommendation Systems", "abstract": "Effective software reuse has long been regarded as an important foundation for a more engineering-like approach to software development. Proactive recommendation systems that have the ability to unobtrusively suggest immediately applicable reuse opportunities can become a crucial step toward realizing this goal and making reuse more practical. This chapter focuses on tools that support reuse through the recommendation of source code-reuse-oriented code recommendation systems (ROCR). These support a large variety of common code reuse approaches from the copy-and-paste metaphor to other techniques such as automatically generating code using the knowledge gained by mining source code repositories. In this chapter, we discuss the foundations of software search and reuse, provide an overview of the main characteristics of ROCR systems, and describe how they can be built.", "keywords": [], "author": "Hummel"}, {"content": "Facilitating the Comparison of Software Retrieval Systems through a Reference Reuse Collection\nAlthough the idea of component-based software reuse has been around for more than four decades the technology for retrieving reusable software artefacts has grown out of its infancy only recently. After about 30 years of basic research in which scientists struggled to get their hands on meaningful numbers of reusable artifacts to evaluate their prototypes, the \"open source revolution\" has made software reuse a serious practical possibility. Millions of reusable files have become freely available and more sophisticated retrieval tools have emerged providing better ways of searching among them. However, while the development of such systems has made considerable progress, their evaluation is still largely driven by proprietary approaches which are all too often neither comprehensive nor comparable to one another. Hence, in this position paper, we propose the compilation of a reference collection of reusable artifacts in order to facilitate the future evaluation and comparison of software retrieval tools.", "file_path": "./data/paper/Hummel/1809175.1809180.pdf", "title": "Facilitating the Comparison of Software Retrieval Systems through a Reference Reuse Collection", "abstract": "Although the idea of component-based software reuse has been around for more than four decades the technology for retrieving reusable software artefacts has grown out of its infancy only recently. After about 30 years of basic research in which scientists struggled to get their hands on meaningful numbers of reusable artifacts to evaluate their prototypes, the \"open source revolution\" has made software reuse a serious practical possibility. Millions of reusable files have become freely available and more sophisticated retrieval tools have emerged providing better ways of searching among them. However, while the development of such systems has made considerable progress, their evaluation is still largely driven by proprietary approaches which are all too often neither comprehensive nor comparable to one another. Hence, in this position paper, we propose the compilation of a reference collection of reusable artifacts in order to facilitate the future evaluation and comparison of software retrieval tools.", "keywords": ["H.3.7 [Information Storage and Retrieval]: Digital Librariesstandards Measurement", "Standardization Component-based software development", "information retrieval", "reference reuse collection"], "author": "Hummel"}, {"content": "Acquisition of practical skills in the protected learning space of a scientific community\nDigitalization is constantly forcing companies to refine their products, services and business models. To shape this change, companies expect not only knowledge of the technologies from the graduates of the respective study programs, but also comprehensive methodological and social competences. For this purpose, we describe the concept of a project semester in an Enterprise Computing study programme, which imparts the required skills. The task set by a partner in the industry allows the achievement of different goals and integrates the various dimensions. On this basis, we describe the best practices in the areas of project management, knowledge building, administration and dealing with customers and other stakeholders. The description of actually carried out projects shows the application of our concept and allows the reader to transfer the best practices to his own needs. Finally, we point out the advantages for the project participants and outline expansion potential.", "file_path": "./data/paper/Dietrich/2018_p63-groschel.pdf", "title": "Acquisition of practical skills in the protected learning space of a scientific community", "abstract": "Digitalization is constantly forcing companies to refine their products, services and business models. To shape this change, companies expect not only knowledge of the technologies from the graduates of the respective study programs, but also comprehensive methodological and social competences. For this purpose, we describe the concept of a project semester in an Enterprise Computing study programme, which imparts the required skills. The task set by a partner in the industry allows the achievement of different goals and integrates the various dimensions. On this basis, we describe the best practices in the areas of project management, knowledge building, administration and dealing with customers and other stakeholders. The description of actually carried out projects shows the application of our concept and allows the reader to transfer the best practices to his own needs. Finally, we point out the advantages for the project participants and outline expansion potential.", "keywords": ["Applied computing \u2192 Education; teaching", "project management", "education", "students competences", "best practices"], "author": "Dietrich"}, {"content": "PARTICLE RADIATION EFFECTS ON AND CALIBRATION OF SPACE INFRARED DETECTORS\nThe GGnfrared detectors of ESA's IS0 satellite are sensitive enough to allow measurements at the limits imposed by the natural sky background. However, high-energy protons and electrons of the earth's radiation belts induce spikes, higher dark current and detector noise as well as an increased level of responsivity. These effects cause signal drift& for hours after the belt passage, resulting in a temporary loss of the detectors' photometric calibration. The passage of the ISOPHOT detectors through the radiation belts has therefore been simulated in the laboratory and effective curing methods searched for to restore the detectors' photometric calibration. With a combination of bright &flashes of the ISOPHOT onboard calibration source and a bias boost the detectors can now be reset to within a few percent of the pre-irmdiation chamcteristics.", "file_path": "./data/paper/Dietrich/1-s2.0-0273117796000191-main.pdf", "title": "PARTICLE RADIATION EFFECTS ON AND CALIBRATION OF SPACE INFRARED DETECTORS", "abstract": "The GGnfrared detectors of ESA's IS0 satellite are sensitive enough to allow measurements at the limits imposed by the natural sky background. However, high-energy protons and electrons of the earth's radiation belts induce spikes, higher dark current and detector noise as well as an increased level of responsivity. These effects cause signal drift& for hours after the belt passage, resulting in a temporary loss of the detectors' photometric calibration. The passage of the ISOPHOT detectors through the radiation belts has therefore been simulated in the laboratory and effective curing methods searched for to restore the detectors' photometric calibration. With a combination of bright &flashes of the ISOPHOT onboard calibration source and a bias boost the detectors can now be reset to within a few percent of the pre-irmdiation chamcteristics.", "keywords": ["b&red Space Observatory (ISO)", "ISOPHOT", "infrared detectors", "high-energy radiation effects", "curing", "IR-flash", "bias boost"], "author": "Dietrich"}, {"content": "Design of a Realtime Industrial-Ethernet Network Including Hot-Pluggable Asynchronous Devices\nThis paper presents a new approach to design a realtime Industrial-Ethernet network with the feature to add asynchronous sending devices like Laptops or Personal Computers to the network without affecting the realtime performance. The idea is based on the modification of switching engines with usage of standard cabling and network interface cards. Although the implementation is similar to Siemens ProfiNet IRT, we will point out advantages of our more general approach. As a research result in formal modelling, we present a way to generate bipartite conflict graphs out of a given network infrastructure and communication requests of the devices. This graphs can be colored in polynomial time, which leads to schedules for each switch. Additionally, we discuss possible hardware designs of a modified Ethernet switch. I.", "file_path": "./data/paper/Dopatka/Design_of_a_Realtime_Industrial-Ethernet_Network_Including_Hot-Pluggable_Asynchronous_Devices.pdf", "title": "Design of a Realtime Industrial-Ethernet Network Including Hot-Pluggable Asynchronous Devices", "abstract": "This paper presents a new approach to design a realtime Industrial-Ethernet network with the feature to add asynchronous sending devices like Laptops or Personal Computers to the network without affecting the realtime performance. The idea is based on the modification of switching engines with usage of standard cabling and network interface cards. Although the implementation is similar to Siemens ProfiNet IRT, we will point out advantages of our more general approach. As a research result in formal modelling, we present a way to generate bipartite conflict graphs out of a given network infrastructure and communication requests of the devices. This graphs can be colored in polynomial time, which leads to schedules for each switch. Additionally, we discuss possible hardware designs of a modified Ethernet switch. I.", "keywords": [], "author": "Dopatka"}, {"content": "The practical clinical value of three-dimensional models of complex congenitally malformed hearts\nObjective: Detailed 3-dimensional anatomic information is essential when planning strategies of surgical treatment for patients with complex congenitally malformed hearts. Current imaging techniques, however, do not always provide all the necessary anatomic information in a user-friendly fashion. We sought to assess the practical clinical value of realistic 3-dimensional models of complex congenitally malformed hearts. Methods: In 11 patients, aged from 0.8 to 27 years, all with complex congenitally malformed hearts, an unequivocal decision regarding the optimum surgical strategy had not been reached when using standard diagnostic tools. Therefore, we constructed 3-dimensional virtual computer and printed cast models of the heart on the basis of high-resolution whole-heart or cine magnetic resonance imaging or computed tomography. Anatomic descriptions were compared with intraoperative findings when surgery was performed. Results: Independently of age-related factors, images acquired in all patients using magnetic resonance imaging and computed tomography proved to be of sufficient quality for producing the models without major differences in the postprocessing and revealing the anatomy in an unequivocal 3-dimensional context. Examination of the models provided invaluable additional information that supported the surgical decision-making. The anatomy as shown in the models was confirmed during surgery. Biventricular corrective surgery was achieved in 5 patients, palliative surgery was achieved in 3 patients, and lack of suitable surgical options was confirmed in the remaining 3 patients. Conclusion: Realistic 3-dimensional modeling of the heart provides a new means for the assessment of complex intracardiac anatomy. We expect this method to change current diagnostic approaches and facilitate preoperative planning.", "file_path": "./data/paper/Wolf/1-s2.0-S0022522309004127-main.pdf", "title": "The practical clinical value of three-dimensional models of complex congenitally malformed hearts", "abstract": "Objective: Detailed 3-dimensional anatomic information is essential when planning strategies of surgical treatment for patients with complex congenitally malformed hearts. Current imaging techniques, however, do not always provide all the necessary anatomic information in a user-friendly fashion. We sought to assess the practical clinical value of realistic 3-dimensional models of complex congenitally malformed hearts. Methods: In 11 patients, aged from 0.8 to 27 years, all with complex congenitally malformed hearts, an unequivocal decision regarding the optimum surgical strategy had not been reached when using standard diagnostic tools. Therefore, we constructed 3-dimensional virtual computer and printed cast models of the heart on the basis of high-resolution whole-heart or cine magnetic resonance imaging or computed tomography. Anatomic descriptions were compared with intraoperative findings when surgery was performed. Results: Independently of age-related factors, images acquired in all patients using magnetic resonance imaging and computed tomography proved to be of sufficient quality for producing the models without major differences in the postprocessing and revealing the anatomy in an unequivocal 3-dimensional context. Examination of the models provided invaluable additional information that supported the surgical decision-making. The anatomy as shown in the models was confirmed during surgery. Biventricular corrective surgery was achieved in 5 patients, palliative surgery was achieved in 3 patients, and lack of suitable surgical options was confirmed in the remaining 3 patients. Conclusion: Realistic 3-dimensional modeling of the heart provides a new means for the assessment of complex intracardiac anatomy. We expect this method to change current diagnostic approaches and facilitate preoperative planning.", "keywords": [], "author": "Wolf"}, {"content": "Deep Learning Techniques for Automatic MRI Cardiac Multi-structures Segmentation and Diagnosis: Is the Problem Solved?\nDelineation of the left ventricular cavity, myocardium and right ventricle from cardiac magnetic resonance images (multi-slice 2D cine MRI) is a common clinical task to establish diagnosis. The automation of the corresponding tasks has thus been the subject of intense research over the past decades. In this paper, we introduce the \"Automatic Cardiac Diagnosis Challenge\" dataset (ACDC), the largest publicly-available and fully-annotated dataset for the purpose of Cardiac MRI (CMR) assessment. The dataset contains data from 150 multi-equipments CMRI recordings with reference measurements and classification", "file_path": "./data/paper/Wolf/Bernard et al_Deep learning techniques automatic MRI_2018_Accepted.pdf", "title": "Deep Learning Techniques for Automatic MRI Cardiac Multi-structures Segmentation and Diagnosis: Is the Problem Solved?", "abstract": "Delineation of the left ventricular cavity, myocardium and right ventricle from cardiac magnetic resonance images (multi-slice 2D cine MRI) is a common clinical task to establish diagnosis. The automation of the corresponding tasks has thus been the subject of intense research over the past decades. In this paper, we introduce the \"Automatic Cardiac Diagnosis Challenge\" dataset (ACDC), the largest publicly-available and fully-annotated dataset for the purpose of Cardiac MRI (CMR) assessment. The dataset contains data from 150 multi-equipments CMRI recordings with reference measurements and classification", "keywords": [], "author": "Wolf"}, {"content": "A Statistical Deformable Model for the Segmentation of Liver CT Volumes\nWe present a fully automated method based on an evolutionary algorithm, a statistical shape model (SSM), and a deformable mesh to tackle the liver segmentation task of the MICCAI Grand Challenge workshop. To model the expected shape and appearance, the SSM is trained on the 20 provided training datasets. Segmentation is started by a global search with the evolutionary algorithm, which provides the initial parameters for the SSM. Subsequently, a local search similar to the Active Shape method is used to refine the detected parameters. The resulting model is used to initialize the main component of our approach: a deformable mesh that strives for an equilibrium between internal and external forces. The internal forces describe the deviation of the mesh from the underlying SSM, while the external forces model the fit to the image data. To constrain the allowed deformation, we employ a graphbased optimal surface detection during calculation of the external forces. Applied to the ten test datasets of the workshop, our method delivers comparable results to the human second rater in six cases and scores an average of 59 points.", "file_path": "./data/paper/Wolf/document.pdf", "title": "A Statistical Deformable Model for the Segmentation of Liver CT Volumes", "abstract": "We present a fully automated method based on an evolutionary algorithm, a statistical shape model (SSM), and a deformable mesh to tackle the liver segmentation task of the MICCAI Grand Challenge workshop. To model the expected shape and appearance, the SSM is trained on the 20 provided training datasets. Segmentation is started by a global search with the evolutionary algorithm, which provides the initial parameters for the SSM. Subsequently, a local search similar to the Active Shape method is used to refine the detected parameters. The resulting model is used to initialize the main component of our approach: a deformable mesh that strives for an equilibrium between internal and external forces. The internal forces describe the deviation of the mesh from the underlying SSM, while the external forces model the fit to the image data. To constrain the allowed deformation, we employ a graphbased optimal surface detection during calculation of the external forces. Applied to the ten test datasets of the workshop, our method delivers comparable results to the human second rater in six cases and scores an average of 59 points.", "keywords": [], "author": "Wolf"}, {"content": "The Medical Imaging Interaction Toolkit: challenges and advances 10 years of open-source development\nPurpose The Medical Imaging Interaction Toolkit (MITK) has been available as open-source software for almost 10 years now. In this period the requirements of software systems in the medical image processing domain have become increasingly complex. The aim of this paper is to show how MITK evolved into a software system that is able to cover all steps of a clinical workflow including data retrieval, image analysis, diagnosis, treatment planning, intervention support, and treatment control. Methods MITK provides modularization and extensibility on different levels. In addition to the original toolkit, a module system, micro services for small, system-wide features, a service-oriented architecture based on the Open Services Gateway initiative (OSGi) standard, and an extensible and configurable application framework allow MITK to be used, extended and deployed as needed. A refined software process was implemented to deliver high-quality software, ease the fulfillment of regulatory requirements, and enable teamwork in mixed-competence teams.", "file_path": "./data/paper/Wolf/s11548-013-0840-8.pdf", "title": "The Medical Imaging Interaction Toolkit: challenges and advances 10 years of open-source development", "abstract": "Purpose The Medical Imaging Interaction Toolkit (MITK) has been available as open-source software for almost 10 years now. In this period the requirements of software systems in the medical image processing domain have become increasingly complex. The aim of this paper is to show how MITK evolved into a software system that is able to cover all steps of a clinical workflow including data retrieval, image analysis, diagnosis, treatment planning, intervention support, and treatment control. Methods MITK provides modularization and extensibility on different levels. In addition to the original toolkit, a module system, micro services for small, system-wide features, a service-oriented architecture based on the Open Services Gateway initiative (OSGi) standard, and an extensible and configurable application framework allow MITK to be used, extended and deployed as needed. A refined software process was implemented to deliver high-quality software, ease the fulfillment of regulatory requirements, and enable teamwork in mixed-competence teams.", "keywords": ["Open-source", "Medical image analysis", "Platform", "Extensible", "Service-oriented architecture", "Software process", "Quality management", "Image-guided therapy"], "author": "Wolf"}, {"content": "Interactive segmentation framework of the Medical Imaging Interaction Toolkit\nInteractive methods are indispensable for real world applications of segmentation in medicine, at least to allow for convenient and fast verification and correction of automated techniques. Besides traditional interactive tasks such as adding or removing parts of a segmentation, adjustment of contours or the placement of seed points, the relatively recent Graph Cut and Random Walker segmentation methods demonstrate an interest in advanced interactive strategies for segmentation. Though the value of toolkits and extensible applications is generally accepted for the development of new segmentation algorithms, the topic of interactive segmentation applications is rarely addressed by current toolkits and applications. In this paper, we present the extension of the Medical Imaging Interaction Toolkit (MITK) with a framework for the development of interactive applications for image segmentation. The framework provides a clear structure for the development of new applications and offers a plugin mechanism to easily extend existing applications with additional segmentation tools. In addition, the framework supports shape-based interpolation and multi-level undo/redo of modifications to binary images. To demonstrate the value of the framework, we also present a free, open-source application named InteractiveSegmentation for manual segmentation of medical images (including 3D+t), which is built based on the extended MITK framework. The application includes several features to effectively support manual segmentation, which are not found in comparable freely available applications. InteractiveSegmentation is fully developed and successfully and regularly used in several projects. Using the plugin mechanism, the application enables developers of new algorithms to begin algorithmic work more quickly.", "file_path": "./data/paper/Wolf/Interactive_segmentation_framework_of_th.pdf", "title": "Interactive segmentation framework of the Medical Imaging Interaction Toolkit", "abstract": "Interactive methods are indispensable for real world applications of segmentation in medicine, at least to allow for convenient and fast verification and correction of automated techniques. Besides traditional interactive tasks such as adding or removing parts of a segmentation, adjustment of contours or the placement of seed points, the relatively recent Graph Cut and Random Walker segmentation methods demonstrate an interest in advanced interactive strategies for segmentation. Though the value of toolkits and extensible applications is generally accepted for the development of new segmentation algorithms, the topic of interactive segmentation applications is rarely addressed by current toolkits and applications. In this paper, we present the extension of the Medical Imaging Interaction Toolkit (MITK) with a framework for the development of interactive applications for image segmentation. The framework provides a clear structure for the development of new applications and offers a plugin mechanism to easily extend existing applications with additional segmentation tools. In addition, the framework supports shape-based interpolation and multi-level undo/redo of modifications to binary images. To demonstrate the value of the framework, we also present a free, open-source application named InteractiveSegmentation for manual segmentation of medical images (including 3D+t), which is built based on the extended MITK framework. The application includes several features to effectively support manual segmentation, which are not found in comparable freely available applications. InteractiveSegmentation is fully developed and successfully and regularly used in several projects. Using the plugin mechanism, the application enables developers of new algorithms to begin algorithmic work more quickly.", "keywords": [], "author": "Wolf"}, {"content": "Comparison of Four Freely Available Frameworks for Image Processing and Visualization That Use ITK\nMost image processing and visualization applications allow users to configure computation parameters and manipulate the resulting visualizations. SCIRun, VolView, MeVisLab, and the Medical Interaction Toolkit (MITK) are four image processing and visualization frameworks that were built for these purposes. All frameworks are freely available and all allow the use of the ITK C++ library. In this paper, the benefits and limitations of each visualization framework are presented to aid both application developers and users in the decision of which framework may be best to use for their application. The analysis is based on more than 50 evaluation criteria, functionalities, and example applications. We report implementation times for various steps in the creation of a reference application in each of the compared frameworks. The data-flow programming frameworks, SCIRun and MeVisLab, were determined to be best for developing application prototypes, while VolView was advantageous for nonautomatic end-user applications based on existing ITK functionalities, and MITK was preferable for automated end-user applications that might include new ITK classes specifically designed for the application.", "file_path": "./data/paper/Wolf/Comparison_of_Four_Freely_Available_Frameworks_for_Image_Processing_and_Visualization_That_Use_ITK.pdf", "title": "Comparison of Four Freely Available Frameworks for Image Processing and Visualization That Use ITK", "abstract": "Most image processing and visualization applications allow users to configure computation parameters and manipulate the resulting visualizations. SCIRun, VolView, MeVisLab, and the Medical Interaction Toolkit (MITK) are four image processing and visualization frameworks that were built for these purposes. All frameworks are freely available and all allow the use of the ITK C++ library. In this paper, the benefits and limitations of each visualization framework are presented to aid both application developers and users in the decision of which framework may be best to use for their application. The analysis is based on more than 50 evaluation criteria, functionalities, and example applications. We report implementation times for various steps in the creation of a reference application in each of the compared frameworks. The data-flow programming frameworks, SCIRun and MeVisLab, were determined to be best for developing application prototypes, while VolView was advantageous for nonautomatic end-user applications based on existing ITK functionalities, and MITK was preferable for automated end-user applications that might include new ITK classes specifically designed for the application.", "keywords": ["Visualization framework", "image processing", "user interface", "comparison", "evaluation"], "author": "Wolf"}, {"content": "Intraoperative assessment of right ventricular volume and function *\nObjective: Right ventricular function is an important aspect of global cardiac performance which affects patients' outcome after cardiac surgery. Due to its geometrical complexity, the assessment of right ventricular function is still a very difficult task. Aim of this study was to investigate the value of a new technique for intraoperative assessment of right ventricle based on transesophageal 3D-echocardiography, and to compare it to volumetric thermodilution by using a new generation of fast response thermistor pulmonary artery catheters. Methods: Twentyfive patients with coronary artery disease underwent 68 intraoperative measurements by 3D-echocardiography and thermodilution simultaneously. Following parameters were analysed: right ventricular end-diastolic volume (RVEDV), end-systolic volume (RVESV) and ejection fraction (RVEF). Pulmonary, systemic and central venous pressures were simultaneously recorded. Segmentation of right ventricular volumes were obtained by the 'Coons-Patches' technique, which was implemented into the EchoAnalyzer w , a multitask system developed at our institution for three-dimensional functional and structural measurements. Results: Right ventricular volumes obtained by 3D-echocardiography did not show significant correlations to those obtained by thermodilution. Volumetric thermodilution systematically overestimates right ventricular volumes. Significant correlations were found between RVEF measured by 3D-echocardiography and those obtained by thermodilution (rZ0. 93; yZ0.2C0.80x; SEEZ0.03; P!0.01). Bland-Altmann analysis showed that thermodilution systematically underestimates RVEF. The bias for measuring RVEF was C15.6% with a precision of G4.3%. The patients were divided into two groups according to left ventricular function. The group of patients with impaired function showed significantly lower right ventricular ejection fraction (44.1G4.6 vs. 55.1G3.9%; P!0.01). Conclusions: Three-dimensional echocardiography provides a useful non-invasive tool for intraoperative and serial assessment of right ventricular function. This new technique, which overcomes the limitations of previous methods, may offer key insights into management and outcome of patients with severe impairment of cardiac function.", "file_path": "./data/paper/Wolf/27-6-988.pdf", "title": "Intraoperative assessment of right ventricular volume and function *", "abstract": "Objective: Right ventricular function is an important aspect of global cardiac performance which affects patients' outcome after cardiac surgery. Due to its geometrical complexity, the assessment of right ventricular function is still a very difficult task. Aim of this study was to investigate the value of a new technique for intraoperative assessment of right ventricle based on transesophageal 3D-echocardiography, and to compare it to volumetric thermodilution by using a new generation of fast response thermistor pulmonary artery catheters. Methods: Twentyfive patients with coronary artery disease underwent 68 intraoperative measurements by 3D-echocardiography and thermodilution simultaneously. Following parameters were analysed: right ventricular end-diastolic volume (RVEDV), end-systolic volume (RVESV) and ejection fraction (RVEF). Pulmonary, systemic and central venous pressures were simultaneously recorded. Segmentation of right ventricular volumes were obtained by the 'Coons-Patches' technique, which was implemented into the EchoAnalyzer w , a multitask system developed at our institution for three-dimensional functional and structural measurements. Results: Right ventricular volumes obtained by 3D-echocardiography did not show significant correlations to those obtained by thermodilution. Volumetric thermodilution systematically overestimates right ventricular volumes. Significant correlations were found between RVEF measured by 3D-echocardiography and those obtained by thermodilution (rZ0. 93; yZ0.2C0.80x; SEEZ0.03; P!0.01). Bland-Altmann analysis showed that thermodilution systematically underestimates RVEF. The bias for measuring RVEF was C15.6% with a precision of G4.3%. The patients were divided into two groups according to left ventricular function. The group of patients with impaired function showed significantly lower right ventricular ejection fraction (44.1G4.6 vs. 55.1G3.9%; P!0.01). Conclusions: Three-dimensional echocardiography provides a useful non-invasive tool for intraoperative and serial assessment of right ventricular function. This new technique, which overcomes the limitations of previous methods, may offer key insights into management and outcome of patients with severe impairment of cardiac function.", "keywords": ["Right ventricle", "Ejection fraction", "Monitoring", "Transesophageal 3D echocardiography", "Pulmonary artery catheter", "Thermodilution", "Cardiac surgery"], "author": "Wolf"}, {"content": "The Medical Imaging Interaction Toolkit\nThoroughly designed, open-source toolkits emerge to boost progress in medical imaging. The Insight Toolkit (ITK) provides this for the algorithmic scope of medical imaging, especially for segmentation and registration. But medical imaging algorithms have to be clinically applied to be useful, which additionally requires visualization and interaction. The Visualization Toolkit (VTK) has powerful visualization capabilities, but only low-level support for interaction. In this paper, we present the Medical Imaging Interaction Toolkit (MITK). The goal of MITK is to significantly reduce the effort required to construct specifically tailored, interactive applications for medical image analysis. MITK allows an easy combination of algorithms developed by ITK with visualizations created by VTK and extends these two toolkits with those features, which are outside the scope of both. MITK adds support for complex interactions with multiple states as well as undo-capabilities, a very important prerequisite for convenient user interfaces. Furthermore, MITK facilitates the realization of multiple, different views of the same data (as a multiplanar reconstruction and a 3D rendering) and supports the visualization of 3D+t data, whereas VTK is only designed to create one kind of view of 2D or 3D data. MITK reuses virtually everything from ITK and VTK. Thus, it is not at all a competitor to ITK or VTK, but an extension, which eases the combination of both and adds the features required for interactive, convenient to use medical imaging software. MITK is an open-source project (www.mitk.org).", "file_path": "./data/paper/Wolf/The_Medical_Imaging_Interaction_Toolkit (1).pdf", "title": "The Medical Imaging Interaction Toolkit", "abstract": "Thoroughly designed, open-source toolkits emerge to boost progress in medical imaging. The Insight Toolkit (ITK) provides this for the algorithmic scope of medical imaging, especially for segmentation and registration. But medical imaging algorithms have to be clinically applied to be useful, which additionally requires visualization and interaction. The Visualization Toolkit (VTK) has powerful visualization capabilities, but only low-level support for interaction. In this paper, we present the Medical Imaging Interaction Toolkit (MITK). The goal of MITK is to significantly reduce the effort required to construct specifically tailored, interactive applications for medical image analysis. MITK allows an easy combination of algorithms developed by ITK with visualizations created by VTK and extends these two toolkits with those features, which are outside the scope of both. MITK adds support for complex interactions with multiple states as well as undo-capabilities, a very important prerequisite for convenient user interfaces. Furthermore, MITK facilitates the realization of multiple, different views of the same data (as a multiplanar reconstruction and a 3D rendering) and supports the visualization of 3D+t data, whereas VTK is only designed to create one kind of view of 2D or 3D data. MITK reuses virtually everything from ITK and VTK. Thus, it is not at all a competitor to ITK or VTK, but an extension, which eases the combination of both and adds the features required for interactive, convenient to use medical imaging software. MITK is an open-source project (www.mitk.org).", "keywords": ["ITK", "Interaction", "Visualization", "Toolkit", "VTK"], "author": "Wolf"}, {"content": "Automatic Cardiac Disease Assessment on cine-MRI via Time-Series Segmentation and Domain Specific Features\nCardiac magnetic resonance imaging improves on diagnosis of cardiovascular diseases by providing images at high spatiotemporal resolution. Manual evaluation of these time-series, however, is expensive and prone to biased and non-reproducible outcomes. In this paper, we present a method that addresses named limitations by integrating segmentation and disease classification into a fully automatic processing pipeline. We use an ensemble of UNet inspired architectures for segmentation of cardiac structures such as the left and right ventricular cavity (LVC, RVC) and the left ventricular myocardium (LVM) on each time instance of the cardiac cycle. For the classification task, information is extracted from the segmented time-series in form of comprehensive features handcrafted to reflect diagnostic clinical procedures. Based on these features we train an ensemble of heavily regularized multilayer perceptrons (MLP) and a random forest classifier to predict the pathologic target class. We evaluated our method on the ACDC dataset (4 pathology groups, 1 healthy group) and achieve dice scores of 0.945 (LVC), 0.908 (RVC) and 0.905 (LVM) in a cross-validation over the training set (100 cases) and 0.950 (LVC), 0.923 (RVC) and 0.911 (LVM) on the test set (50 cases). We report a classification accuracy of 94% on a training set cross-validation and 92% on the test set. Our results underpin the potential of machine learning methods for accurate, fast and reproducible segmentation and computer-assisted diagnosis (CAD).", "file_path": "./data/paper/Wolf/1707.00587.pdf", "title": "Automatic Cardiac Disease Assessment on cine-MRI via Time-Series Segmentation and Domain Specific Features", "abstract": "Cardiac magnetic resonance imaging improves on diagnosis of cardiovascular diseases by providing images at high spatiotemporal resolution. Manual evaluation of these time-series, however, is expensive and prone to biased and non-reproducible outcomes. In this paper, we present a method that addresses named limitations by integrating segmentation and disease classification into a fully automatic processing pipeline. We use an ensemble of UNet inspired architectures for segmentation of cardiac structures such as the left and right ventricular cavity (LVC, RVC) and the left ventricular myocardium (LVM) on each time instance of the cardiac cycle. For the classification task, information is extracted from the segmented time-series in form of comprehensive features handcrafted to reflect diagnostic clinical procedures. Based on these features we train an ensemble of heavily regularized multilayer perceptrons (MLP) and a random forest classifier to predict the pathologic target class. We evaluated our method on the ACDC dataset (4 pathology groups, 1 healthy group) and achieve dice scores of 0.945 (LVC), 0.908 (RVC) and 0.905 (LVM) in a cross-validation over the training set (100 cases) and 0.950 (LVC), 0.923 (RVC) and 0.911 (LVM) on the test set (50 cases). We report a classification accuracy of 94% on a training set cross-validation and 92% on the test set. Our results underpin the potential of machine learning methods for accurate, fast and reproducible segmentation and computer-assisted diagnosis (CAD).", "keywords": ["Automated Cardiac Diagnosis Challenge", "Cardiac Magnetic Resonance Imaging", "Disease Prediction", "Deep Learning", "CNN"], "author": "Wolf"}, {"content": "The Medical Imaging Interaction Toolkit (MITK)a toolkit facilitating the creation of interactive software by extending VTK and ITK\nThe aim of the Medical Imaging Interaction Toolkit (MITK) is to facilitate the creation of clinically usable image-based software. Clinically usable software for image-guided procedures and image analysis require a high degree of interaction to verify and, if necessary, correct results from (semi-)automatic algorithms. MITK is a class library basing on and extending the Insight Toolkit (ITK) and the Visualization Toolkit (VTK). ITK provides leading-edge registration and segmentation algorithms and forms the algorithmic basis. VTK has powerful visualization capabilities, but only low-level support for interaction (like picking methods, rotation, movement and scaling of objects). MITK adds support for high level interactions with data like, for example, the interactive construction and modification of data objects. This includes concepts for interactions with multiple states as well as undo-capabilities. Furthermore, VTK is designed to create one kind of view on the data (either one 2D visualization or a 3D visualization). MITK facilitates the realization of multiple, different views on the same data (like multiple, multiplanar reconstructions and a 3D rendering). Hierarchically structured combinations of any number and type of data objects (image, surface, vessels, etc.) are possible. MITK can handle 3D+t data, which are required for several important medical applications, whereas VTK alone supports only 2D and 3D data. The benefit of MITK is that it supplements those features to ITK and VTK that are required for convenient to use, interactive and by that clinically usable image-based software, and that are outside the scope of both. MITK will be made open-source (http://www.mitk.org).", "file_path": "./data/paper/Wolf/title_The_medical_imaging_interaction_t.pdf", "title": "The Medical Imaging Interaction Toolkit (MITK)a toolkit facilitating the creation of interactive software by extending VTK and ITK", "abstract": "The aim of the Medical Imaging Interaction Toolkit (MITK) is to facilitate the creation of clinically usable image-based software. Clinically usable software for image-guided procedures and image analysis require a high degree of interaction to verify and, if necessary, correct results from (semi-)automatic algorithms. MITK is a class library basing on and extending the Insight Toolkit (ITK) and the Visualization Toolkit (VTK). ITK provides leading-edge registration and segmentation algorithms and forms the algorithmic basis. VTK has powerful visualization capabilities, but only low-level support for interaction (like picking methods, rotation, movement and scaling of objects). MITK adds support for high level interactions with data like, for example, the interactive construction and modification of data objects. This includes concepts for interactions with multiple states as well as undo-capabilities. Furthermore, VTK is designed to create one kind of view on the data (either one 2D visualization or a 3D visualization). MITK facilitates the realization of multiple, different views on the same data (like multiple, multiplanar reconstructions and a 3D rendering). Hierarchically structured combinations of any number and type of data objects (image, surface, vessels, etc.) are possible. MITK can handle 3D+t data, which are required for several important medical applications, whereas VTK alone supports only 2D and 3D data. The benefit of MITK is that it supplements those features to ITK and VTK that are required for convenient to use, interactive and by that clinically usable image-based software, and that are outside the scope of both. MITK will be made open-source (http://www.mitk.org).", "keywords": ["MITK", "ITK", "VTK", "interaction", "visualization"], "author": "Wolf"}, {"content": "Analysis of Mobile App Revenue Models Used in the Most Popular Games of the Tower Defense Genre on Google Play\nThis paper analyzes the revenue models of the most popular games of the Tower Defense genre on Google Play. A special look is taken at the quantitative distribution of the app sale model and the free model in terms of quality and download numbers. Additionally, this paper considers the qualitative implementation of the \"free\" model in the most popular games. First, the usual revenue models of mobile apps will be discussed and then the Tower Defense genre will be explained. Following that, the quantitative distribution of revenue models and an analysis of the most popular apps' respective revenue models will be addressed. The analysis also identifies and explains two modifications of established revenue models. The most popular revenue model for mobile apps in the Tower Defense genre are in-app purchases. This distinguishes the genre from many other genres and games. A wide range of Tower Defense games utilizes the revenue models app sale and free. It becomes apparent that revenue models for mobile apps must be analyzed and considered specifically for their respective sector, and that no single promising revenue model for apps exists.", "file_path": "./data/paper/Gro\u0308schel/admin,+2_Analysis+of+Mobile+App+Revenue+Models+Used+in+the+most+Popular.pdf", "title": "Analysis of Mobile App Revenue Models Used in the Most Popular Games of the Tower Defense Genre on Google Play", "abstract": "This paper analyzes the revenue models of the most popular games of the Tower Defense genre on Google Play. A special look is taken at the quantitative distribution of the app sale model and the free model in terms of quality and download numbers. Additionally, this paper considers the qualitative implementation of the \"free\" model in the most popular games. First, the usual revenue models of mobile apps will be discussed and then the Tower Defense genre will be explained. Following that, the quantitative distribution of revenue models and an analysis of the most popular apps' respective revenue models will be addressed. The analysis also identifies and explains two modifications of established revenue models. The most popular revenue model for mobile apps in the Tower Defense genre are in-app purchases. This distinguishes the genre from many other genres and games. A wide range of Tower Defense games utilizes the revenue models app sale and free. It becomes apparent that revenue models for mobile apps must be analyzed and considered specifically for their respective sector, and that no single promising revenue model for apps exists.", "keywords": ["revenue model", "business model", "mobile apps", "Google Play", "mobile games"], "author": "Gro\u0308schel"}, {"content": "Evaluation of the Development Framework Jasonette\nIn this paper we evaluate the development framework Jasonette which is based on JSON. The evaluation was carried out by means of a prototype of an app informing prospective students about the study program Enterprise Computing at the Mannheim University of Applied Sciences. We defined a set of criteria that we cross checked during and after the implementation of the app.", "file_path": "./data/paper/Gro\u0308schel/admin,+2-Evaluation+of+the+Development+Framework+Jasonette.pdf", "title": "Evaluation of the Development Framework Jasonette", "abstract": "In this paper we evaluate the development framework Jasonette which is based on JSON. The evaluation was carried out by means of a prototype of an app informing prospective students about the study program Enterprise Computing at the Mannheim University of Applied Sciences. We defined a set of criteria that we cross checked during and after the implementation of the app.", "keywords": ["Jasonette", "Development Framework", "Evaluation", "Prototype _________________________________________________________________________________"], "author": "Gro\u0308schel"}, {"content": "Acquisition of practical skills in the protected learning space of a scientific community\nDigitalization is constantly forcing companies to refine their products, services and business models. To shape this change, companies expect not only knowledge of the technologies from the graduates of the respective study programs, but also comprehensive methodological and social competences. For this purpose, we describe the concept of a project semester in an Enterprise Computing study programme, which imparts the required skills. The task set by a partner in the industry allows the achievement of different goals and integrates the various dimensions. On this basis, we describe the best practices in the areas of project management, knowledge building, administration and dealing with customers and other stakeholders. The description of actually carried out projects shows the application of our concept and allows the reader to transfer the best practices to his own needs. Finally, we point out the advantages for the project participants and outline expansion potential.", "file_path": "./data/paper/Gro\u0308schel/3209087.3209090.pdf", "title": "Acquisition of practical skills in the protected learning space of a scientific community", "abstract": "Digitalization is constantly forcing companies to refine their products, services and business models. To shape this change, companies expect not only knowledge of the technologies from the graduates of the respective study programs, but also comprehensive methodological and social competences. For this purpose, we describe the concept of a project semester in an Enterprise Computing study programme, which imparts the required skills. The task set by a partner in the industry allows the achievement of different goals and integrates the various dimensions. On this basis, we describe the best practices in the areas of project management, knowledge building, administration and dealing with customers and other stakeholders. The description of actually carried out projects shows the application of our concept and allows the reader to transfer the best practices to his own needs. Finally, we point out the advantages for the project participants and outline expansion potential.", "keywords": ["Applied computing \u2192 Education; teaching", "project management", "education", "students competences", "best practices"], "author": "Gro\u0308schel"}, {"content": "Using SOAR as a Semantic Support Component for Sensor Web Enablement\nSemantic service discovery is necessary to facilitate the potential of service providers (many sensors, different characteristics) to change the sensor configuration in a generic surveillance application without modifications to the application's business logic. To combine efficiency and flexibility, semantic annotation of sensors and semantic aware match making components are needed. This short paper gives the reader an understandig of the SOAR component for semantic SWE support and rule based sensor selection.", "file_path": "./data/paper/Leuchter/kannegieser.pdf", "title": "Using SOAR as a Semantic Support Component for Sensor Web Enablement", "abstract": "Semantic service discovery is necessary to facilitate the potential of service providers (many sensors, different characteristics) to change the sensor configuration in a generic surveillance application without modifications to the application's business logic. To combine efficiency and flexibility, semantic annotation of sensors and semantic aware match making components are needed. This short paper gives the reader an understandig of the SOAR component for semantic SWE support and rule based sensor selection.", "keywords": ["SOAR", "SCA", "rule based sensor selection", "service discovery"], "author": "Leuchter"}, {"content": "Distribution of Human-Machine Interfaces in System-of-Systems Engineering\nSystem-of-systems integration requires sharing of data, algorithms, user authorization/authentication, and user interfaces between independent systems. While SOA promises to solve the first issues the latter is still open. Within an experimental prototype for a distributed information system we have tested different methods to share not only the algorithmics and data of services but also their user interface. The experimental prototype consists of nodes providing services within process portals and nodes realizing services with software agents. Some of the services were extended with WSRP (web service remote portlet) to provide their own user interface components that can be transmitted between separated containers and application servers. Interoperability tests were conducted on JBoss and BEA Portal Workshop. Open questions remain on how the layout of one component should influence the internal layout of other GUI-components displayed concurrently. Former work on user interface management systems could improve todays tools in that respect.", "file_path": "./data/paper/Leuchter/978-3-642-02556-3_31.pdf", "title": "Distribution of Human-Machine Interfaces in System-of-Systems Engineering", "abstract": "System-of-systems integration requires sharing of data, algorithms, user authorization/authentication, and user interfaces between independent systems. While SOA promises to solve the first issues the latter is still open. Within an experimental prototype for a distributed information system we have tested different methods to share not only the algorithmics and data of services but also their user interface. The experimental prototype consists of nodes providing services within process portals and nodes realizing services with software agents. Some of the services were extended with WSRP (web service remote portlet) to provide their own user interface components that can be transmitted between separated containers and application servers. Interoperability tests were conducted on JBoss and BEA Portal Workshop. Open questions remain on how the layout of one component should influence the internal layout of other GUI-components displayed concurrently. Former work on user interface management systems could improve todays tools in that respect.", "keywords": ["HMI", "portlet", "wsrp", "web clipping", "interoperability"], "author": "Leuchter"}, {"content": "Development of Micro UAV Swarms\nSome complex application scenarios for micro UAVs (Unmanned Aerial Vehicles) call for the formation of swarms of multiple drones. In this paper a platform for the creation of such swarms is presented. It consists of modified commercial quadrocopters and a self-made ground control station software architecture. Autonomy of individual drones is generated through a micro controller equipped video camera. Currently it is possible to fly basic maneuvers autonomously, such as takeoff , fly to position, and landing. In the future the camera's image processing capabilities will be used to generate additional control information. Different cooperation strategies for teams of UAVs are currently evaluated with an agent based simulation tool. Finally complex application scenarios for multiple micro UAVs are presented.", "file_path": "./data/paper/Leuchter/ams2009.pdf", "title": "Development of Micro UAV Swarms", "abstract": "Some complex application scenarios for micro UAVs (Unmanned Aerial Vehicles) call for the formation of swarms of multiple drones. In this paper a platform for the creation of such swarms is presented. It consists of modified commercial quadrocopters and a self-made ground control station software architecture. Autonomy of individual drones is generated through a micro controller equipped video camera. Currently it is possible to fly basic maneuvers autonomously, such as takeoff , fly to position, and landing. In the future the camera's image processing capabilities will be used to generate additional control information. Different cooperation strategies for teams of UAVs are currently evaluated with an agent based simulation tool. Finally complex application scenarios for multiple micro UAVs are presented.", "keywords": [], "author": "Leuchter"}, {"content": "Co-operating Miniature UAVs for Surveillance and Reconnaissance\nSome complex application scenarios for micro UAVs call for the formation of swarms of multiple drones. In this paper a platform for the creation of such swarms is presented. It consists of modified commercial quadrocopters and a self-made ground control station software architecture. Autonomy of individual drones is generated through a micro controller equipped video camera. Currently it is possible to fly basic maneuvers autonomously, such as takeoff , fly to position, and landing. In the future the camera's image processing capabilities will be used to generate additional control information. Different cooperation strategies for teams of UAVs are currently evaluated with an agent based simulation tool. Finally complex application scenarios for multiple micro UAVs are presented.", "file_path": "./data/paper/Leuchter/001.pdf", "title": "Co-operating Miniature UAVs for Surveillance and Reconnaissance", "abstract": "Some complex application scenarios for micro UAVs call for the formation of swarms of multiple drones. In this paper a platform for the creation of such swarms is presented. It consists of modified commercial quadrocopters and a self-made ground control station software architecture. Autonomy of individual drones is generated through a micro controller equipped video camera. Currently it is possible to fly basic maneuvers autonomously, such as takeoff , fly to position, and landing. In the future the camera's image processing capabilities will be used to generate additional control information. Different cooperation strategies for teams of UAVs are currently evaluated with an agent based simulation tool. Finally complex application scenarios for multiple micro UAVs are presented.", "keywords": [], "author": "Leuchter"}, {"content": "Agent-Based Web for Information Fusion in Military Intelligence, Surveillance, and Reconnaissance\nThis paper describes a research prototype of an experimental information management system for the German Federal Armed Forces. The system is realized as an agent-based web for military intelligence, surveillance, and reconnaissance (ISR). Users can access ISR related information, services, and experts through this web. Information management is based on a semantic representation of sensor data and other ISR information. The system supports information fusion and offers personalized functionalities. This contribution reports the current state of the system, its software architecture and support functions.", "file_path": "./data/paper/Leuchter/Agent-based_web_for_information_fusion_in_military_intelligence_surveillance_and_reconnaissance.pdf", "title": "Agent-Based Web for Information Fusion in Military Intelligence, Surveillance, and Reconnaissance", "abstract": "This paper describes a research prototype of an experimental information management system for the German Federal Armed Forces. The system is realized as an agent-based web for military intelligence, surveillance, and reconnaissance (ISR). Users can access ISR related information, services, and experts through this web. Information management is based on a semantic representation of sensor data and other ISR information. The system supports information fusion and offers personalized functionalities. This contribution reports the current state of the system, its software architecture and support functions.", "keywords": ["defense", "intelligence", "surveillance", "and reconnaissance", "ISR", "information space", "information fusion", "software agents", "software architecture", "personalization"], "author": "Leuchter"}, {"content": "Usability Engineering of \"In Vehicle Information Systems\" With Multi-Tasking GOMS\nThe developments in vehicle electronics and new services are supposed to promise more convenience in driving. The offers and ideas range from vehicle-related installations, such as accident alert, petrol station assistance, dynamic navigation and travel guide, to communication and entertainment services. There is one central design problem that is essential for achieving the main objective \"safe motor vehicle driving\", i.e. the use of the new service must not unduly distract the driver. In the following, we present a newly developed procedure to calculate the interference between main and additional tasks based on user models, which can already be applied in the early phases of system design. Thus, the driving tasks are described as ideal-typical resource profiles. There is only required a formative-quantitative task analysis in order to assess the secondary task using the well known and commonly approved method GOMS with some new extensions for multi-tasking.", "file_path": "./data/paper/Leuchter/MEK2008.pdf", "title": "Usability Engineering of \"In Vehicle Information Systems\" With Multi-Tasking GOMS", "abstract": "The developments in vehicle electronics and new services are supposed to promise more convenience in driving. The offers and ideas range from vehicle-related installations, such as accident alert, petrol station assistance, dynamic navigation and travel guide, to communication and entertainment services. There is one central design problem that is essential for achieving the main objective \"safe motor vehicle driving\", i.e. the use of the new service must not unduly distract the driver. In the following, we present a newly developed procedure to calculate the interference between main and additional tasks based on user models, which can already be applied in the early phases of system design. Thus, the driving tasks are described as ideal-typical resource profiles. There is only required a formative-quantitative task analysis in order to assess the secondary task using the well known and commonly approved method GOMS with some new extensions for multi-tasking.", "keywords": [], "author": "Leuchter"}, {"content": "Collaborative Attack Mitigation and Response: A survey\nOver recent years, network-based attacks have become one of the top causes of network infrastructure and service outages. To counteract such attacks, an approach is to move mitigation from the target network to the networks of Internet Service Providers (ISP). However, it remains unclear to what extent countermeasures are set up and which mitigation approaches are adopted by ISPs. The goal of this paper is to present the results of a survey that aims to gain insight into processes, structures and capabilities of ISPs to mitigate and respond to network-based attacks.", "file_path": "./data/paper/Steinberger/im2015_Upload.pdf", "title": "Collaborative Attack Mitigation and Response: A survey", "abstract": "Over recent years, network-based attacks have become one of the top causes of network infrastructure and service outages. To counteract such attacks, an approach is to move mitigation from the target network to the networks of Internet Service Providers (ISP). However, it remains unclear to what extent countermeasures are set up and which mitigation approaches are adopted by ISPs. The goal of this paper is to present the results of a survey that aims to gain insight into processes, structures and capabilities of ISPs to mitigate and respond to network-based attacks.", "keywords": [], "author": "Steinberger"}, {"content": "IoT-Botnet Detection and Isolation by Access Routers\nIn recent years, emerging technologies such as the Internet of Things gain increasing interest in various communities. However, the majority of IoT devices have little or no protection at software and infrastructure levels and thus are also opening up new vulnerabilities that might be misused by cybercriminals to perform large-scale cyber attacks by means of IoT botnets. These kind of attacks lead to infrastructure and service outages and cause enormous financial loss, image and reputation damage. One approach to proactively block the spreading of such IoT botnets is to automatically scan for vulnerable IoT devices and isolate them from the Internet before they are compromised and also become part of the IoT botnet. The goal of this paper is to present an IoT botnet detection and isolation approach at the level of access routers that makes IoT devices more attack resilient. We show that our IoT botnet detection and isolation approach helps to prevent the compromise of IoT devices without the need to have in-depth technical administration knowledge, and hence make it viable for customers and end users.", "file_path": "./data/paper/Steinberger/iot_botnet.pdf", "title": "IoT-Botnet Detection and Isolation by Access Routers", "abstract": "In recent years, emerging technologies such as the Internet of Things gain increasing interest in various communities. However, the majority of IoT devices have little or no protection at software and infrastructure levels and thus are also opening up new vulnerabilities that might be misused by cybercriminals to perform large-scale cyber attacks by means of IoT botnets. These kind of attacks lead to infrastructure and service outages and cause enormous financial loss, image and reputation damage. One approach to proactively block the spreading of such IoT botnets is to automatically scan for vulnerable IoT devices and isolate them from the Internet before they are compromised and also become part of the IoT botnet. The goal of this paper is to present an IoT botnet detection and isolation approach at the level of access routers that makes IoT devices more attack resilient. We show that our IoT botnet detection and isolation approach helps to prevent the compromise of IoT devices without the need to have in-depth technical administration knowledge, and hence make it viable for customers and end users.", "keywords": [], "author": "Steinberger"}, {"content": "Driver identification using in-vehicle digital data in the forensic context of a hit and run accident\nOne major focus in forensics is the identification of individuals based on different kinds of evidence found at a crime scene and in the digital domain. In the present study, we assessed the potential of using in-vehicle digital data to capture the natural driving behavior of individuals in order to identify them. Freely available data was used to classify drivers by their natural driving behavior. We formulated a forensic scenario of a hit and run car accident with three known suspects. Suggestions are provided for an understandable and useful reporting of model results in the light of the requirements in digital forensics. Specific aims of this study were 1) to develop a workflow for driver identification in digital forensics, 2) to apply a simple but sound method for model validation with time series data and 3) to transfer the model results to answers to the two forensic questions a) to which suspect does the evidence most likely belong to and b) how certain is the evidence claim. Based on freely available data (Kwak et al., 2017) the first question could be answered by unsupervised classification using a random forest model validated by random block splitting. To answer the second question we used model accuracy and false detection rate (FDR) which were 93% and 7%, respectively. Furthermore, we reported the random match probability (RMP) as well as the opportunity of a visual interpretation of the prediction on the time series for the evidence data in our hypothetical hit and run accident.", "file_path": "./data/paper/Steinberger/1-s2.0-S2666281720303929-main.pdf", "title": "Driver identification using in-vehicle digital data in the forensic context of a hit and run accident", "abstract": "One major focus in forensics is the identification of individuals based on different kinds of evidence found at a crime scene and in the digital domain. In the present study, we assessed the potential of using in-vehicle digital data to capture the natural driving behavior of individuals in order to identify them. Freely available data was used to classify drivers by their natural driving behavior. We formulated a forensic scenario of a hit and run car accident with three known suspects. Suggestions are provided for an understandable and useful reporting of model results in the light of the requirements in digital forensics. Specific aims of this study were 1) to develop a workflow for driver identification in digital forensics, 2) to apply a simple but sound method for model validation with time series data and 3) to transfer the model results to answers to the two forensic questions a) to which suspect does the evidence most likely belong to and b) how certain is the evidence claim. Based on freely available data (Kwak et al., 2017) the first question could be answered by unsupervised classification using a random forest model validated by random block splitting. To answer the second question we used model accuracy and false detection rate (FDR) which were 93% and 7%, respectively. Furthermore, we reported the random match probability (RMP) as well as the opportunity of a visual interpretation of the prediction on the time series for the evidence data in our hypothetical hit and run accident.", "keywords": [], "author": "Steinberger"}, {"content": "Collaborative DDoS Defense using Flow-based Security Event Information\nOver recent years, network-based attacks evolved to the top concerns responsible for network infrastructure and service outages. To counteract such attacks, an approach is to move mitigation from the target network to the networks of Internet Service Providers (ISP). In addition, exchanging threat information among trusted partners is used to reduce the time needed to detect and respond to large-scale network-based attacks. However, exchanging threat information is currently done on an ad-hoc basis via email or telephone, and there is still no interoperable standard to exchange threat information among trusted partners. To facilitate the exchange of security event information in conjunction with widely adopted monitoring technologies, in particular network flows, we make use of the exchange format FLEX. The goal of this paper is to present a communication process that supports the dissemination of threat information based on FLEX in context of ISPs. We show that this communication process helps organizations to speed up their mitigation and response capabilities without the need to modify the current network infrastructure, and hence make it viable to use for network operators.", "file_path": "./data/paper/Steinberger/steinberger-noms-2016.pdf", "title": "Collaborative DDoS Defense using Flow-based Security Event Information", "abstract": "Over recent years, network-based attacks evolved to the top concerns responsible for network infrastructure and service outages. To counteract such attacks, an approach is to move mitigation from the target network to the networks of Internet Service Providers (ISP). In addition, exchanging threat information among trusted partners is used to reduce the time needed to detect and respond to large-scale network-based attacks. However, exchanging threat information is currently done on an ad-hoc basis via email or telephone, and there is still no interoperable standard to exchange threat information among trusted partners. To facilitate the exchange of security event information in conjunction with widely adopted monitoring technologies, in particular network flows, we make use of the exchange format FLEX. The goal of this paper is to present a communication process that supports the dissemination of threat information based on FLEX in context of ISPs. We show that this communication process helps organizations to speed up their mitigation and response capabilities without the need to modify the current network infrastructure, and hence make it viable to use for network operators.", "keywords": [], "author": "Steinberger"}, {"content": "Towards automated incident handling: How to select an appropriate response against a network-based attack?\nThe increasing amount of network-based attacks evolved to one of the top concerns responsible for network infrastructure and service outages. In order to counteract these threats, computer networks are monitored to detect malicious traffic and initiate suitable reactions. However, initiating a suitable reaction is a process of selecting an appropriate response related to the identified network-based attack. The process of selecting a response requires to take into account the economics of an reaction e.g., risks and benefits. The literature describes several response selection models, but they are not widely adopted. In addition, these models and their evaluation are often not reproducible due to closed testing data. In this paper, we introduce a new response selection model, called REASSESS, that allows to mitigate network-based attacks by incorporating an intuitive response selection process that evaluates negative and positive impacts associated with each countermeasure. We compare REASSESS with the response selection models of IE-IRS, ADEPTS, CS-IRS, and TVA and show that REASSESS is able to select the most appropriate response to an attack in consideration of the positive and negative impacts and thus reduces the effects caused by an network-based attack. Further, we show that REASSESS is aligned to the NIST incident life cycle. We expect REASSESS to help organizations to select the most appropriate response measure against a detected network-based attack, and hence contribute to mitigate them.", "file_path": "./data/paper/Steinberger/Towards_Automated_Incident_Handling_How_to_Select_an_Appropriate_Response_against_a_Network-Based_Attack.pdf", "title": "Towards automated incident handling: How to select an appropriate response against a network-based attack?", "abstract": "The increasing amount of network-based attacks evolved to one of the top concerns responsible for network infrastructure and service outages. In order to counteract these threats, computer networks are monitored to detect malicious traffic and initiate suitable reactions. However, initiating a suitable reaction is a process of selecting an appropriate response related to the identified network-based attack. The process of selecting a response requires to take into account the economics of an reaction e.g., risks and benefits. The literature describes several response selection models, but they are not widely adopted. In addition, these models and their evaluation are often not reproducible due to closed testing data. In this paper, we introduce a new response selection model, called REASSESS, that allows to mitigate network-based attacks by incorporating an intuitive response selection process that evaluates negative and positive impacts associated with each countermeasure. We compare REASSESS with the response selection models of IE-IRS, ADEPTS, CS-IRS, and TVA and show that REASSESS is able to select the most appropriate response to an attack in consideration of the positive and negative impacts and thus reduces the effects caused by an network-based attack. Further, we show that REASSESS is aligned to the NIST incident life cycle. We expect REASSESS to help organizations to select the most appropriate response measure against a detected network-based attack, and hence contribute to mitigate them.", "keywords": ["cyber security", "intrusion response systems", "network security", "automatic mitigation"], "author": "Steinberger"}, {"content": "Distributed DDoS Defense:A collaborative Approach at Internet Scale\nDistributed large-scale cyber attacks targeting the availability of computing and network resources still remain a serious threat. To limit the effects caused by those attacks and to provide a proactive defense, mitigation should move to the networks of Internet Service Providers (ISPs). In this context, this thesis focuses on a development of a collaborative, automated approach to mitigate the effects of Distributed Denial of Service (DDoS) attacks at Internet Scale. This thesis has the following contributions: i) a systematic and multifaceted study on mitigation of large-scale cyber attacks at ISPs. ii) A detailed guidance selecting an exchange format and protocol suitable to use to disseminate threat information. iii) To overcome the shortcomings of missing flow-based interoperability of current exchange formats, a development of the exchange format Flowbased Event Exchange Format (FLEX). iv) A communication process to facilitate the automated defense in response to ongoing network-based attacks, v) a model to select and perform a semi-automatic deployment of suitable response actions. vi) An investigation of the effectiveness of the defense techniques moving-target using Software Defined Networking (SDN) and their applicability in context of large-scale cyber attacks and the networks of ISPs. Finally, a trust model that determines a trust and a knowledge level of a security event to deploy semi-automated remediations and facilitate the dissemination of security event information using the exchange format FLEX in context of ISP networks.", "file_path": "./data/paper/Steinberger/Steinberger2020distributed.pdf", "title": "Distributed DDoS Defense:A collaborative Approach at Internet Scale", "abstract": "Distributed large-scale cyber attacks targeting the availability of computing and network resources still remain a serious threat. To limit the effects caused by those attacks and to provide a proactive defense, mitigation should move to the networks of Internet Service Providers (ISPs). In this context, this thesis focuses on a development of a collaborative, automated approach to mitigate the effects of Distributed Denial of Service (DDoS) attacks at Internet Scale. This thesis has the following contributions: i) a systematic and multifaceted study on mitigation of large-scale cyber attacks at ISPs. ii) A detailed guidance selecting an exchange format and protocol suitable to use to disseminate threat information. iii) To overcome the shortcomings of missing flow-based interoperability of current exchange formats, a development of the exchange format Flowbased Event Exchange Format (FLEX). iv) A communication process to facilitate the automated defense in response to ongoing network-based attacks, v) a model to select and perform a semi-automatic deployment of suitable response actions. vi) An investigation of the effectiveness of the defense techniques moving-target using Software Defined Networking (SDN) and their applicability in context of large-scale cyber attacks and the networks of ISPs. Finally, a trust model that determines a trust and a knowledge level of a security event to deploy semi-automated remediations and facilitate the dissemination of security event information using the exchange format FLEX in context of ISP networks.", "keywords": ["DDoS", "Mitigation", "Reaction", "Dissemination", "future attacks", "attack intensities"], "author": "Steinberger"}, {"content": "FORENSIC DRIVER IDENTIFICATION CONSIDERING AN UNKNOWN SUSPECT\nOne major focus in forensics is the identification of individuals based on different kinds of evidence found at a crime scene and in the digital domain. Here, we assess the potential of using in-vehicle digital data to capture the natural driving behavior of individuals in order to identify them. We formulate a forensic scenario of a hit-and-run car accident with a known and an unknown suspect being the actual driver during the accident. Specific aims of this study are (i) to further develop a workflow for driver identification in digital forensics considering a scenario with an unknown suspect, and (ii) to assess the potential of one-class compared to multi-class classification for this task. The developed workflow demonstrates that in the application of machine learning in digital forensics it is important to decide on the statistical application, data mining or hypothesis testing in advance. Further, multi-class classification is superior to one-class classification in terms of statistical model quality. Using multi-class classification it is possible to contribute to the identification of the driver in the hit-and-run accident in both types of application, data mining and hypothesis testing. Model quality is in the range of already employed methods for forensic identification of individuals.", "file_path": "./data/paper/Steinberger/10.34768_amcs-2021-0040.pdf", "title": "FORENSIC DRIVER IDENTIFICATION CONSIDERING AN UNKNOWN SUSPECT", "abstract": "One major focus in forensics is the identification of individuals based on different kinds of evidence found at a crime scene and in the digital domain. Here, we assess the potential of using in-vehicle digital data to capture the natural driving behavior of individuals in order to identify them. We formulate a forensic scenario of a hit-and-run car accident with a known and an unknown suspect being the actual driver during the accident. Specific aims of this study are (i) to further develop a workflow for driver identification in digital forensics considering a scenario with an unknown suspect, and (ii) to assess the potential of one-class compared to multi-class classification for this task. The developed workflow demonstrates that in the application of machine learning in digital forensics it is important to decide on the statistical application, data mining or hypothesis testing in advance. Further, multi-class classification is superior to one-class classification in terms of statistical model quality. Using multi-class classification it is possible to contribute to the identification of the driver in the hit-and-run accident in both types of application, data mining and hypothesis testing. Model quality is in the range of already employed methods for forensic identification of individuals.", "keywords": ["natural driving behavior", "digital biometry", "OCC", "CAN-BUS data", "validation"], "author": "Steinberger"}, {"content": "Booters and Certificates: An Overview of TLS in the DDoS-as-a-Service Landscape\nDistributed Denial of Service attacks are getting more sophisticated and frequent whereas the required technical knowledge to perform these attacks decreases. The reason is that Distributed Denial of Service attacks are offered as a service, namely Booters, for less than 10 US dollars. As Booters offer a Distributed Denial of Service service that is paid, Booters often make use of Transport Layer Security certificates to appear trusted and hide themselves inside of encrypted traffic in order to evade detection and bypass critical security controls. In addition, Booters use Transport Layer Security certificates to ensure secure credit card transactions, data transfer and logins for their customers. In this article, we review Booters websites and their use of Secure Socket Layer certificates. In particular, we analyze the certificate chain, the used cryptography and cipher suites, protocol use within Transport Layer Security for purpose of security parameters negotiation, the issuer, the validity of the certificate and the hosting companies. Our main finding is that Booters prefer elliptic curve cryptography and are using Advanced Encryption Standard with a 128 bit key in Galois/Counter Mode. Further, we found a typical certificate chain used by most of the Booters.", "file_path": "./data/paper/Steinberger/accse_2017_3_30_80028.pdf", "title": "Booters and Certificates: An Overview of TLS in the DDoS-as-a-Service Landscape", "abstract": "Distributed Denial of Service attacks are getting more sophisticated and frequent whereas the required technical knowledge to perform these attacks decreases. The reason is that Distributed Denial of Service attacks are offered as a service, namely Booters, for less than 10 US dollars. As Booters offer a Distributed Denial of Service service that is paid, Booters often make use of Transport Layer Security certificates to appear trusted and hide themselves inside of encrypted traffic in order to evade detection and bypass critical security controls. In addition, Booters use Transport Layer Security certificates to ensure secure credit card transactions, data transfer and logins for their customers. In this article, we review Booters websites and their use of Secure Socket Layer certificates. In particular, we analyze the certificate chain, the used cryptography and cipher suites, protocol use within Transport Layer Security for purpose of security parameters negotiation, the issuer, the validity of the certificate and the hosting companies. Our main finding is that Booters prefer elliptic curve cryptography and are using Advanced Encryption Standard with a 128 bit key in Galois/Counter Mode. Further, we found a typical certificate chain used by most of the Booters.", "keywords": ["booters", "certificates", "distributed denial of service as a service", "mitigation", "tls"], "author": "Steinberger"}, {"content": "DDoS Defense using MTD and SDN\nDistributed large-scale cyber attacks targeting the availability of computing and network resources still remains a serious threat. In order to limit the effects caused by those attacks and to provide a proactive defense, mitigation should move to the networks of Internet Service Providers. In this context, Moving Target Defense (MTD) is a technique that increases uncertainty due to an ever-changing attack surface. In combination with Software Defined Networking (SDN), MTD has the potential to reduce the effects of a large-scale cyber attack. In this paper, we combine the defense techniques movingtarget using Software Defined Networking and investigate their effectiveness. We review current moving-target defense strategies and their applicability in context of large-scale cyber attacks and the networks of Internet Service Providers. Further, we enforce the implementation of moving target defense strategies using Software Defined Networks in a collaborative environment. In particular, we focus on ISPs that cooperate among trusted partners. We found that the effects of a large-scale cyber attack can be significantly reduced using the moving-target defense and Software Defined Networking. Moreover, we show that Software Defined Networking is an appropriate approach to enforce implementation of the moving target defense and thus mitigate the effects caused by large-scale cyber attacks.", "file_path": "./data/paper/Steinberger/DDoS_defense_using_MTD_and_SDN.pdf", "title": "DDoS Defense using MTD and SDN", "abstract": "Distributed large-scale cyber attacks targeting the availability of computing and network resources still remains a serious threat. In order to limit the effects caused by those attacks and to provide a proactive defense, mitigation should move to the networks of Internet Service Providers. In this context, Moving Target Defense (MTD) is a technique that increases uncertainty due to an ever-changing attack surface. In combination with Software Defined Networking (SDN), MTD has the potential to reduce the effects of a large-scale cyber attack. In this paper, we combine the defense techniques movingtarget using Software Defined Networking and investigate their effectiveness. We review current moving-target defense strategies and their applicability in context of large-scale cyber attacks and the networks of Internet Service Providers. Further, we enforce the implementation of moving target defense strategies using Software Defined Networks in a collaborative environment. In particular, we focus on ISPs that cooperate among trusted partners. We found that the effects of a large-scale cyber attack can be significantly reduced using the moving-target defense and Software Defined Networking. Moreover, we show that Software Defined Networking is an appropriate approach to enforce implementation of the moving target defense and thus mitigate the effects caused by large-scale cyber attacks.", "keywords": [], "author": "Steinberger"}, {"content": "\"LUDO\" -KIDS PLAYING DISTRIBUTED DENIAL OF SERVICE\nDistributed denial of service attacks pose a serious threat to the availability of the network infrastructures and services. G\u00c9ANT, the pan-European network with terabit capacities witnesses close to hundreds of DDoS attacks on a daily basis. The reason is that DDoS attacks are getting larger, more sophisticated and frequent. At the same time, it has never been easier to execute DDoS attacks, e.g., Booter services offer paying customers without any technical knowledge the possibility to perform DDoS attacks as a service. Given the increasing size, frequency and complexity of DDoS attacks, there is a need to perform a collaborative mitigation. Therefore, we developed (i) a DDoSDB to share real attack data and allow collaborators to query, compare, and download attacks, (ii) the Security attack experimentation framework to test mitigation and response capabilities and (iii) a collaborative mitigation and response process among trusted partners to disseminate security event information. In addition to these developments, we present and would like to discuss our latest research results with experienced networking operators and bridging the gap between academic research and operational business.", "file_path": "./data/paper/Steinberger/03-paper-TNC2016-2.pdf", "title": "\"LUDO\" -KIDS PLAYING DISTRIBUTED DENIAL OF SERVICE", "abstract": "Distributed denial of service attacks pose a serious threat to the availability of the network infrastructures and services. G\u00c9ANT, the pan-European network with terabit capacities witnesses close to hundreds of DDoS attacks on a daily basis. The reason is that DDoS attacks are getting larger, more sophisticated and frequent. At the same time, it has never been easier to execute DDoS attacks, e.g., Booter services offer paying customers without any technical knowledge the possibility to perform DDoS attacks as a service. Given the increasing size, frequency and complexity of DDoS attacks, there is a need to perform a collaborative mitigation. Therefore, we developed (i) a DDoSDB to share real attack data and allow collaborators to query, compare, and download attacks, (ii) the Security attack experimentation framework to test mitigation and response capabilities and (iii) a collaborative mitigation and response process among trusted partners to disseminate security event information. In addition to these developments, we present and would like to discuss our latest research results with experienced networking operators and bridging the gap between academic research and operational business.", "keywords": ["Distributed Denial of Service as a Service", "Security attack experimentation framework", "Mitigation and response", "Collaboration Defense", "Testing mitigation and response", "Firewall On Demand", "NSHaRP", "Netflow", "BGP Flowspec"], "author": "Steinberger"}, {"content": "How to Exchange Security Events? Overview and Evaluation of Formats and Protocols\nNetwork-based attacks pose a strong threat to the Internet landscape. Recent approaches to mitigate and resolve these threats focus on cooperation of Internet service providers and their exchange of security event information. A major benefit of a cooperation is that it might counteract a network-based attack at its root and provides the possibility to inform other cooperative partners about the occurrence of anomalous events as a proactive service. In this paper we provide a structured overview of existing exchange formats and protocols. We evaluate and compare the exchange formats and protocols in context of high-speed networks. In particular, we focus on flow data. In addition, we investigate the exchange of potentially sensitive data. For our overview, we review different exchange formats and protocols with respect to their use-case scenario, their interoperability with network flow-based data, their scalability in a high-speed network context and develop a classification.", "file_path": "./data/paper/Steinberger/IM2015.pdf", "title": "How to Exchange Security Events? Overview and Evaluation of Formats and Protocols", "abstract": "Network-based attacks pose a strong threat to the Internet landscape. Recent approaches to mitigate and resolve these threats focus on cooperation of Internet service providers and their exchange of security event information. A major benefit of a cooperation is that it might counteract a network-based attack at its root and provides the possibility to inform other cooperative partners about the occurrence of anomalous events as a proactive service. In this paper we provide a structured overview of existing exchange formats and protocols. We evaluate and compare the exchange formats and protocols in context of high-speed networks. In particular, we focus on flow data. In addition, we investigate the exchange of potentially sensitive data. For our overview, we review different exchange formats and protocols with respect to their use-case scenario, their interoperability with network flow-based data, their scalability in a high-speed network context and develop a classification.", "keywords": [], "author": "Steinberger"}, {"content": "Anomaly Detection and Mitigation at Internet Scale: A Survey\nNetwork-based attacks pose a strong threat to the Internet landscape. There are different possibilities to encounter these threats. On the one hand attack detection operated at the end-users' side, on the other hand attack detection implemented at network operators' infrastructures. An obvious benefit of the second approach is that it counteracts a network-based attack at its root. It is currently unclear to which extent countermeasures are set up at Internet scale and which anomaly detection and mitigation approaches of the community may be adopted by ISPs. We present results of a survey, which aims at gaining insight in industry processes, structures and capabilities of IT companies and the computer networks they run. One result with respect to attack detection is that flow-based detection mechanisms are valuable, because those mechanisms could easily adapt to existing infrastructures. Due to the lack of standardized exchange formats, mitigation across network borders is currently uncommon.", "file_path": "./data/paper/Steinberger/978-3-642-38998-6_7.pdf", "title": "Anomaly Detection and Mitigation at Internet Scale: A Survey", "abstract": "Network-based attacks pose a strong threat to the Internet landscape. There are different possibilities to encounter these threats. On the one hand attack detection operated at the end-users' side, on the other hand attack detection implemented at network operators' infrastructures. An obvious benefit of the second approach is that it counteracts a network-based attack at its root. It is currently unclear to which extent countermeasures are set up at Internet scale and which anomaly detection and mitigation approaches of the community may be adopted by ISPs. We present results of a survey, which aims at gaining insight in industry processes, structures and capabilities of IT companies and the computer networks they run. One result with respect to attack detection is that flow-based detection mechanisms are valuable, because those mechanisms could easily adapt to existing infrastructures. Due to the lack of standardized exchange formats, mitigation across network borders is currently uncommon.", "keywords": ["Anomaly Detection", "Anomaly Mitigation", "Internet Service Provider", "Network Security", "NetFlow", "Correlation"], "author": "Steinberger"}, {"content": "In Whom Do We Trust -Sharing Security Events\nde niveau recherche, publi\u00e9s ou non, \u00e9manant des \u00e9tablissements d'enseignement et de recherche fran\u00e7ais ou \u00e9trangers, des laboratoires publics ou priv\u00e9s.", "file_path": "./data/paper/Steinberger/385745_1_En_11_Chapter.pdf", "title": "In Whom Do We Trust -Sharing Security Events", "abstract": "de niveau recherche, publi\u00e9s ou non, \u00e9manant des \u00e9tablissements d'enseignement et de recherche fran\u00e7ais ou \u00e9trangers, des laboratoires publics ou priv\u00e9s.", "keywords": ["Sharing Security Events", "Attack mitigation", "Internet Service Provider", "Network Security"], "author": "Steinberger"}, {"content": "Modeling Cooperative Business Processes and Transformation to a Service Oriented Architecture\nThe definition and implementation of interdepartment and inter-enterprise value chains requires the integration of staff and organisational units as well as the technological integration of involved IT systems. While for simple document-based standard processes suitable solutions already exist, a continuous methodology to model cooperative inter-organisational business processes and their transformation to adequate IT architectures is still missing. Objective of this paper is to develop a modeling methodology for selected classes of typical cooperation problems. Considering a cooperative service scenario as example, this methodology is described in detail and evaluated.", "file_path": "./data/paper/Specht/Modeling_cooperative_business_processes_and_transformation_to_a_service_oriented_architecture.pdf", "title": "Modeling Cooperative Business Processes and Transformation to a Service Oriented Architecture", "abstract": "The definition and implementation of interdepartment and inter-enterprise value chains requires the integration of staff and organisational units as well as the technological integration of involved IT systems. While for simple document-based standard processes suitable solutions already exist, a continuous methodology to model cooperative inter-organisational business processes and their transformation to adequate IT architectures is still missing. Objective of this paper is to develop a modeling methodology for selected classes of typical cooperation problems. Considering a cooperative service scenario as example, this methodology is described in detail and evaluated.", "keywords": [], "author": "Specht"}, {"content": "Evidence-Based Trustworthiness of Internet-Based Services Through Controlled Software Development\nUsers of Internet-based services are increasingly concerned about the trustworthiness of these services (i.e., apps, software, platforms) thus slowing down their adoption. Therefore, successful software development processes have to address trust concerns from the very early stages of development using constructive and practical methods to enable the trustworthiness of software and services. Unfortunately, even well-established development methodologies do not specifically support the realization of trustworthy Internet-based services today, and trustworthiness-oriented practices do not take objective evidences into account. We propose to use controlled software life-cycle processes for trustworthy Internet-based services. Development, deployment and operations processes, can be controlled by the collection of trustworthiness evidences at different stages. This can be achieved by e.g., measuring the degree of trustworthiness-related properties of the software, and documenting these evidences using digital trustworthiness certificates. This way, other stakeholders are able to verify the trustworthiness properties in a later stages, e.g., in the deployment of software on a marketplace, or the operation of the service at run-time.", "file_path": "./data/paper/Paulus/controlled_software_development.pdf", "title": "Evidence-Based Trustworthiness of Internet-Based Services Through Controlled Software Development", "abstract": "Users of Internet-based services are increasingly concerned about the trustworthiness of these services (i.e., apps, software, platforms) thus slowing down their adoption. Therefore, successful software development processes have to address trust concerns from the very early stages of development using constructive and practical methods to enable the trustworthiness of software and services. Unfortunately, even well-established development methodologies do not specifically support the realization of trustworthy Internet-based services today, and trustworthiness-oriented practices do not take objective evidences into account. We propose to use controlled software life-cycle processes for trustworthy Internet-based services. Development, deployment and operations processes, can be controlled by the collection of trustworthiness evidences at different stages. This can be achieved by e.g., measuring the degree of trustworthiness-related properties of the software, and documenting these evidences using digital trustworthiness certificates. This way, other stakeholders are able to verify the trustworthiness properties in a later stages, e.g., in the deployment of software on a marketplace, or the operation of the service at run-time.", "keywords": ["Trust", "Trustworthiness", "Software Development Methodology", "Digital Trustworthiness Certificate", "Metrics", "Evidences"], "author": "Paulus"}, {"content": "AN APPROACH FOR A BUSINESS-DRIVEN CLOUD-COMPLIANCE ANALYSIS COVERING PUBLIC SECTOR PROCESS IMPROVEMENT REQUIREMENTS\nThe need for process improvement is an important target that does affect as well the government processes. Specifically in the public sector there are specific challenges to face .New technology approaches within government processes such as cloud services are necessary to address these challenges. Following the current discussion of \"cloudification\"of business processes all processes are considered similar in regards to their usability within the cloud. The truth is, that neither all processes have the same usability for cloud services not do they have the same importance for a specific company. The most comprehensive process within a company is the corporate value chain. In this article one key proposition is to use the corporate value chain as the fundamental structuring backbone for all business process analysis and improvement activities. It is a prerequisite to identify the core elements of the value chain that are essential for the individual company's business and the root cause for any company success. In this paper we propose to use the company-specific value-creation for the \"cloud-affinity\" and the \"cloud-usability\" of a business process in public sector considering the specific challenges of addressing processes in cloud services. Therefor it is necessary to formalize the way the processes with its interdependencies are documented in context of their company-specific value chain (as part of the various deployment-and governance alternatives (e.g. security, compliance, quality, adaptability)). Moreover, it is essential in the public sector to describe in detail the environmental / external restrictions of processes.. With the use of this proposed methodology it becomes relatively easy to identify cloud-suitable processes within the public sector and thus optimize the public companies value generation tightly focused with the use of this new technology.", "file_path": "./data/paper/Paulus/1310.2832.pdf", "title": "AN APPROACH FOR A BUSINESS-DRIVEN CLOUD-COMPLIANCE ANALYSIS COVERING PUBLIC SECTOR PROCESS IMPROVEMENT REQUIREMENTS", "abstract": "The need for process improvement is an important target that does affect as well the government processes. Specifically in the public sector there are specific challenges to face .New technology approaches within government processes such as cloud services are necessary to address these challenges. Following the current discussion of \"cloudification\"of business processes all processes are considered similar in regards to their usability within the cloud. The truth is, that neither all processes have the same usability for cloud services not do they have the same importance for a specific company. The most comprehensive process within a company is the corporate value chain. In this article one key proposition is to use the corporate value chain as the fundamental structuring backbone for all business process analysis and improvement activities. It is a prerequisite to identify the core elements of the value chain that are essential for the individual company's business and the root cause for any company success. In this paper we propose to use the company-specific value-creation for the \"cloud-affinity\" and the \"cloud-usability\" of a business process in public sector considering the specific challenges of addressing processes in cloud services. Therefor it is necessary to formalize the way the processes with its interdependencies are documented in context of their company-specific value chain (as part of the various deployment-and governance alternatives (e.g. security, compliance, quality, adaptability)). Moreover, it is essential in the public sector to describe in detail the environmental / external restrictions of processes.. With the use of this proposed methodology it becomes relatively easy to identify cloud-suitable processes within the public sector and thus optimize the public companies value generation tightly focused with the use of this new technology.", "keywords": ["Cloud Services", "Business Processes", "Value Chain", "Compliance", "Public Sector"], "author": "Paulus"}, {"content": "Trustworthy Software Development\net \u00e0 la diffusion de documents scientifiques de niveau recherche, publi\u00e9s ou non, \u00e9manant des \u00e9tablissements d'enseignement et de recherche fran\u00e7ais ou \u00e9trangers, des laboratoires publics ou priv\u00e9s.", "file_path": "./data/paper/Paulus/978-3-642-40779-6_23_Chapter.pdf", "title": "Trustworthy Software Development", "abstract": "et \u00e0 la diffusion de documents scientifiques de niveau recherche, publi\u00e9s ou non, \u00e9manant des \u00e9tablissements d'enseignement et de recherche fran\u00e7ais ou \u00e9trangers, des laboratoires publics ou priv\u00e9s.", "keywords": ["Software development", "Trustworthiness", "Trust", "Trustworthy software", "Trustworthy development practices"], "author": "Paulus"}, {"content": "Extending Software Development Methodologies to Support Trustworthiness-by-Design\nPeople are increasingly concerned about the trustworthiness of software that they use when acting within socio-technical systems. Ideally, software development projects have to address trustworthiness requirements from the very early stages of development using constructive methods to enable trustworthiness-by-design. We analyze the development methodologies with respect to their capabilities for supporting the development of trustworthy software. Our analysis reveals that well-established development methodologies do not specifically support the realization of trustworthy software. Based on findings, we propose a generic mechanism for extending development methodologies by incorporating process chunks that represent best practices and explicitly address the systematical design of trustworthy software. We demonstrate the application of our approach by extending a design methodology to foster the development of trustworthy software for socio-technical systems.", "file_path": "./data/paper/Paulus/paper-28.pdf", "title": "Extending Software Development Methodologies to Support Trustworthiness-by-Design", "abstract": "People are increasingly concerned about the trustworthiness of software that they use when acting within socio-technical systems. Ideally, software development projects have to address trustworthiness requirements from the very early stages of development using constructive methods to enable trustworthiness-by-design. We analyze the development methodologies with respect to their capabilities for supporting the development of trustworthy software. Our analysis reveals that well-established development methodologies do not specifically support the realization of trustworthy software. Based on findings, we propose a generic mechanism for extending development methodologies by incorporating process chunks that represent best practices and explicitly address the systematical design of trustworthy software. We demonstrate the application of our approach by extending a design methodology to foster the development of trustworthy software for socio-technical systems.", "keywords": ["Trustworthiness", "Trustworthiness-by-design", "Software Development Methodology"], "author": "Paulus"}, {"content": "Lattice basis reduction in function elds\nWe present an algorithm for lattice basis reduction in function elds. In contrast to integer lattices, there is a simple algorithm which provably computes a reduced basis in polynomial time. Moreover, this algorithm works only with the coe cients of the polynomials involved, so there is no polynomial arithmetic needed. This algorithm can be generically extended to compute a reduced lattice basis starting from a generating system. Moreover, it can be applied to lattices of integral determinant over the eld of puiseux expansions of a function eld. In that case, this algorithm can be used for computing in Jacobians of curves.", "file_path": "./data/paper/Paulus/document.pdf", "title": "Lattice basis reduction in function elds", "abstract": "We present an algorithm for lattice basis reduction in function elds. In contrast to integer lattices, there is a simple algorithm which provably computes a reduced basis in polynomial time. Moreover, this algorithm works only with the coe cients of the polynomials involved, so there is no polynomial arithmetic needed. This algorithm can be generically extended to compute a reduced lattice basis starting from a generating system. Moreover, it can be applied to lattices of integral determinant over the eld of puiseux expansions of a function eld. In that case, this algorithm can be used for computing in Jacobians of curves.", "keywords": [], "author": "Paulus"}, {"content": "Ubiquitous Learning Applied to Coding\nToday programming is a crucial skill in many disciplines, demanding for an adequate education. Unfortunately, programming education requires a dedicated set of tools (editor, compiler, ...), often forcing the students to use the pre-configured machines at their universities. In an ideal setup, students were able to work on their programming assignments anywhere and on any device. This paper presents an infrastructure and tool set for a bring your own device concept in programming education: lecturers are able to provide applications, data and configurations easily and students can install individualized setups for different lectures and programming languages on their clients with one click. Neither student nor lecturer needs detailed knowledge of the installation or configuration process.", "file_path": "./data/paper/Paulus/3209087.3209104.pdf", "title": "Ubiquitous Learning Applied to Coding", "abstract": "Today programming is a crucial skill in many disciplines, demanding for an adequate education. Unfortunately, programming education requires a dedicated set of tools (editor, compiler, ...), often forcing the students to use the pre-configured machines at their universities. In an ideal setup, students were able to work on their programming assignments anywhere and on any device. This paper presents an infrastructure and tool set for a bring your own device concept in programming education: lecturers are able to provide applications, data and configurations easily and students can install individualized setups for different lectures and programming languages on their clients with one click. Neither student nor lecturer needs detailed knowledge of the installation or configuration process.", "keywords": ["Human-centered computing \u2192 Ubiquitous and mobile computing systems and tools; Ubiquitous Learning", "dynamic configuration", "bring your own device", "virtual environments"], "author": "Paulus"}, {"content": "REAL AND IMAGINARY QUADRATIC REPRESENTATIONS OF HYPERELLIPTIC FUNCTION FIELDS\nA hyperelliptic function field can be always be represented as a real quadratic extension of the rational function field. If at least one of the rational prime divisors is rational over the field of constants, then it also can be represented as an imaginary quadratic extension of the rational function field. The arithmetic in the divisor class group can be realized in the second case by Cantor's algorithm. We show that in the first case one can compute in the divisor class group of the function field using reduced ideals and distances of ideals in the orders involved. Furthermore, we show how the two representations are connected and compare the computational complexity.", "file_path": "./data/paper/Paulus/S0025-5718-99-01066-2.pdf", "title": "REAL AND IMAGINARY QUADRATIC REPRESENTATIONS OF HYPERELLIPTIC FUNCTION FIELDS", "abstract": "A hyperelliptic function field can be always be represented as a real quadratic extension of the rational function field. If at least one of the rational prime divisors is rational over the field of constants, then it also can be represented as an imaginary quadratic extension of the rational function field. The arithmetic in the divisor class group can be realized in the second case by Cantor's algorithm. We show that in the first case one can compute in the divisor class group of the function field using reduced ideals and distances of ideals in the orders involved. Furthermore, we show how the two representations are connected and compare the computational complexity.", "keywords": ["Mathematics Subject Classification. Primary 11R58", "14Q05; Secondary 11R65", "14H05", "14H40 Hyperelliptic curves", "divisor class groups", "real quadratic model"], "author": "Paulus"}, {"content": "An Analysis of Software Quality Attributes and Their Contribution to Trustworthiness\nWhether a software, app, service or infrastructure is trustworthy represents a key success factor for its use and adoption by organizations and end-users. The notion of trustworthiness, though, is actually subject to individual interpretation, e.g. organizations require confidence about how their business critical data is handled whereas end-users may be more concerned about the usability. These concerns manifest as trustworthiness requirements towards modern apps and services. Understanding which Software Quality Attributes (SQA) foster trustworthiness thus becomes an increasingly important piece of knowledge for successful software development. To this end, this paper provides a first attempt to identify SQA, which contribute to trustworthiness. Based on a survey of the literature, we provide a structured overview on SQA and their contribution to trustworthiness. We also identify potential gaps with respect to attributes whose relationship to trustworthiness is understudied such as e.g. accessibility, level of service, etc. Further, we observe that most of the literature studies trustworthiness from a security perspective while there exist limited contributions in studying the social aspects of trustworthiness in computing. We expect this work to contribute to a better understanding of which attributes and characteristics of a software system should be considered to build trustworthy systems.", "file_path": "./data/paper/Paulus/45027.pdf", "title": "An Analysis of Software Quality Attributes and Their Contribution to Trustworthiness", "abstract": "Whether a software, app, service or infrastructure is trustworthy represents a key success factor for its use and adoption by organizations and end-users. The notion of trustworthiness, though, is actually subject to individual interpretation, e.g. organizations require confidence about how their business critical data is handled whereas end-users may be more concerned about the usability. These concerns manifest as trustworthiness requirements towards modern apps and services. Understanding which Software Quality Attributes (SQA) foster trustworthiness thus becomes an increasingly important piece of knowledge for successful software development. To this end, this paper provides a first attempt to identify SQA, which contribute to trustworthiness. Based on a survey of the literature, we provide a structured overview on SQA and their contribution to trustworthiness. We also identify potential gaps with respect to attributes whose relationship to trustworthiness is understudied such as e.g. accessibility, level of service, etc. Further, we observe that most of the literature studies trustworthiness from a security perspective while there exist limited contributions in studying the social aspects of trustworthiness in computing. We expect this work to contribute to a better understanding of which attributes and characteristics of a software system should be considered to build trustworthy systems.", "keywords": ["Trust", "Trustworthiness", "Trustworthiness Attributes (TA)", "Socio-Technical Systems (STS)", "Information"], "author": "Paulus"}, {"content": "A New Public-Key Cryptosystem over a Quadratic Order with Quadratic Decryption Time\nWe present a new cryptosystem based on ideal arithmetic in quadratic orders. The method of our trapdoor is different from the Diffie-Hellman key distribution scheme or the RSA cryptosystem. The plaintext m is encrypted by mp r , where p is a fixed element and r is a random integer, so our proposed cryptosystem is a probabilistic encryption scheme and has the homomorphy property. The most prominent property of our cryptosystem is the cost of the decryption, which is of quadratic bit complexity in the length of the public key. Our implementation shows that it is comparably as fast as the encryption time of the RSA cryptosystem with e = 2 16 + 1. The security of our cryptosystem is closely related to factoring the discriminant of a quadratic order. When we choose appropriate sizes of the parameters, the currently known fast algorithms, for example, the elliptic curve method, the number field sieve, the Hafner-McCurley algorithm, are not applicable. We also discuss that the chosen ciphertext attack is not applicable to our cryptosystem.", "file_path": "./data/paper/Paulus/s001459910010.pdf", "title": "A New Public-Key Cryptosystem over a Quadratic Order with Quadratic Decryption Time", "abstract": "We present a new cryptosystem based on ideal arithmetic in quadratic orders. The method of our trapdoor is different from the Diffie-Hellman key distribution scheme or the RSA cryptosystem. The plaintext m is encrypted by mp r , where p is a fixed element and r is a random integer, so our proposed cryptosystem is a probabilistic encryption scheme and has the homomorphy property. The most prominent property of our cryptosystem is the cost of the decryption, which is of quadratic bit complexity in the length of the public key. Our implementation shows that it is comparably as fast as the encryption time of the RSA cryptosystem with e = 2 16 + 1. The security of our cryptosystem is closely related to factoring the discriminant of a quadratic order. When we choose appropriate sizes of the parameters, the currently known fast algorithms, for example, the elliptic curve method, the number field sieve, the Hafner-McCurley algorithm, are not applicable. We also discuss that the chosen ciphertext attack is not applicable to our cryptosystem.", "keywords": ["Public-key cryptosystem", "Fast decryption", "Quadratic order", "Factoring algorithm", "Chosen ciphertext attack"], "author": "Paulus"}, {"content": "ARITHMETIC ON SUPERELLIPTIC CURVES\nThis paper is concerned with algorithms for computing in the divisor class group of a nonsingular plane curve of the form y n = c(x) which has only one point at infinity. Divisors are represented as ideals, and an ideal reduction algorithm based on lattice reduction is given. We obtain a unique representative for each divisor class and the algorithms for addition and reduction of divisors run in polynomial time. An algorithm is also given for solving the discrete logarithm problem when the curve is defined over a finite field.", "file_path": "./data/paper/Paulus/S0025-5718-00-01297-7.pdf", "title": "ARITHMETIC ON SUPERELLIPTIC CURVES", "abstract": "This paper is concerned with algorithms for computing in the divisor class group of a nonsingular plane curve of the form y n = c(x) which has only one point at infinity. Divisors are represented as ideals, and an ideal reduction algorithm based on lattice reduction is given. We obtain a unique representative for each divisor class and the algorithms for addition and reduction of divisors run in polynomial time. An algorithm is also given for solving the discrete logarithm problem when the curve is defined over a finite field.", "keywords": ["Mathematics Subject Classification. Primary 14Q05", "14H40", "11G20", "11Y16 Superelliptic curve", "divisor class group", "cryptography", "discrete logarithm problem"], "author": "Paulus"}, {"content": "On models of the genetic code generated by binary dichotomic algorithms\nIn this paper we introduce the concept of a BDA-generated model of the genetic code which is based on binary dichotomic algorithms (BDAs). A BDA-generated model is based on binary dichotomic algorithms (BDAs). Such a BDA partitions the set of 64 codons into two disjoint classes of size 32 each and provides a generalization of known partitions like the Rumer dichotomy. We investigate what partitions can be generated when a set of different BDAs is applied sequentially to the set of codons. The search revealed that these models are able to generate code tables with very different numbers of classes ranging from 2 to 64. We have analyzed whether there are models that map the codons to their amino acids. A perfect matching is not possible. However, we present models that describe the standard genetic code with only few errors. There are also models that map all 64 codons uniquely to 64 classes showing that BDAs can be used to identify codons precisely. This could serve as a basis for further mathematical analysis using coding theory, for example. The hypothesis that BDAs might reflect a molecular mechanism taking place in the decoding center of the ribosome is discussed. The scan demonstrated that binary dichotomic partitions are able to model different aspects of the genetic code very well. The search was performed with our tool Beady-A.", "file_path": "./data/paper/Gumbel/1-s2.0-S0303264714002044-main (1).pdf", "title": "On models of the genetic code generated by binary dichotomic algorithms", "abstract": "In this paper we introduce the concept of a BDA-generated model of the genetic code which is based on binary dichotomic algorithms (BDAs). A BDA-generated model is based on binary dichotomic algorithms (BDAs). Such a BDA partitions the set of 64 codons into two disjoint classes of size 32 each and provides a generalization of known partitions like the Rumer dichotomy. We investigate what partitions can be generated when a set of different BDAs is applied sequentially to the set of codons. The search revealed that these models are able to generate code tables with very different numbers of classes ranging from 2 to 64. We have analyzed whether there are models that map the codons to their amino acids. A perfect matching is not possible. However, we present models that describe the standard genetic code with only few errors. There are also models that map all 64 codons uniquely to 64 classes showing that BDAs can be used to identify codons precisely. This could serve as a basis for further mathematical analysis using coding theory, for example. The hypothesis that BDAs might reflect a molecular mechanism taking place in the decoding center of the ribosome is discussed. The scan demonstrated that binary dichotomic partitions are able to model different aspects of the genetic code very well. The search was performed with our tool Beady-A.", "keywords": [], "author": "Gumbel"}, {"content": "ALES: cell lineage analysis and mapping of developmental events\nMotivation: Animals build their bodies by altering the fates of cells. The way in which they do so is reflected in the topology of cell lineages and the fates of terminal cells. Cell lineages should, therefore, contain information about the molecular events that determined them. Here we introduce new tools for visualizing, manipulating, and extracting the information contained in cell lineages. Our tools enable us to analyze very large cell lineages, where previously analyses have only been carried out on cell lineages no larger than a few dozen cells. Results: ALES (A Lineage Evaluation System) allows the display, evaluation and comparison of cell lineages with the aim of identifying molecular and cellular events underlying development. ALES introduces a series of algorithms that locate putative developmental events. The distribution of these predicted events can then be compared to gene expression patterns or other cellular characteristics. In addition, artificial lineages can be generated, or existing lineages modified, according to a range of models, in order to test hypotheses about lineage evolution.", "file_path": "./data/paper/Gumbel/bioinformatics_19_7_851.pdf", "title": "ALES: cell lineage analysis and mapping of developmental events", "abstract": "Motivation: Animals build their bodies by altering the fates of cells. The way in which they do so is reflected in the topology of cell lineages and the fates of terminal cells. Cell lineages should, therefore, contain information about the molecular events that determined them. Here we introduce new tools for visualizing, manipulating, and extracting the information contained in cell lineages. Our tools enable us to analyze very large cell lineages, where previously analyses have only been carried out on cell lineages no larger than a few dozen cells. Results: ALES (A Lineage Evaluation System) allows the display, evaluation and comparison of cell lineages with the aim of identifying molecular and cellular events underlying development. ALES introduces a series of algorithms that locate putative developmental events. The distribution of these predicted events can then be compared to gene expression patterns or other cellular characteristics. In addition, artificial lineages can be generated, or existing lineages modified, according to a range of models, in order to test hypotheses about lineage evolution.", "keywords": [], "author": "Gumbel"}, {"content": "Motif lengths of circular codes in coding sequences\nProtein synthesis is a crucial process in any cell. Translation, in which mRNA is translated into proteins, can lead to several errors, notably frame shifts where the ribosome accidentally skips or re-reads one or more nucleotides. So-called circular codes are capable of discovering frame shifts and their codons can be found disproportionately often in coding sequences. Here, we analyzed motifs of circular codes, i.e. sequences only containing codons of circular codes, in biological and artificial sequences. The lengths of these motifs were compared to a statistical model in order to elucidate if coding sequences contain significantly longer motifs than non-coding sequences. Our findings show that coding sequences indeed show on average greater motif lengths than expected by chance. On the other hand, the motifs are too short for a possible frame shift recognition to take place within an entire coding sequence. This suggests that as much as circular codes might have been used in ancient life forms in order to prevent frame shift errors, it remains to be seen whether they are still functional in current organisms.", "file_path": "./data/paper/Gumbel/1-s2.0-S0022519321001302-main.pdf", "title": "Motif lengths of circular codes in coding sequences", "abstract": "Protein synthesis is a crucial process in any cell. Translation, in which mRNA is translated into proteins, can lead to several errors, notably frame shifts where the ribosome accidentally skips or re-reads one or more nucleotides. So-called circular codes are capable of discovering frame shifts and their codons can be found disproportionately often in coding sequences. Here, we analyzed motifs of circular codes, i.e. sequences only containing codons of circular codes, in biological and artificial sequences. The lengths of these motifs were compared to a statistical model in order to elucidate if coding sequences contain significantly longer motifs than non-coding sequences. Our findings show that coding sequences indeed show on average greater motif lengths than expected by chance. On the other hand, the motifs are too short for a possible frame shift recognition to take place within an entire coding sequence. This suggests that as much as circular codes might have been used in ancient life forms in order to prevent frame shift errors, it remains to be seen whether they are still functional in current organisms.", "keywords": [], "author": "Gumbel"}, {"content": "On models of the genetic code generated by binary dichotomic algorithms\nIn this paper we introduce the concept of a BDA-generated model of the genetic code which is based on binary dichotomic algorithms (BDAs). A BDA-generated model is based on binary dichotomic algorithms (BDAs). Such a BDA partitions the set of 64 codons into two disjoint classes of size 32 each and provides a generalization of known partitions like the Rumer dichotomy. We investigate what partitions can be generated when a set of different BDAs is applied sequentially to the set of codons. The search revealed that these models are able to generate code tables with very different numbers of classes ranging from 2 to 64. We have analyzed whether there are models that map the codons to their amino acids. A perfect matching is not possible. However, we present models that describe the standard genetic code with only few errors. There are also models that map all 64 codons uniquely to 64 classes showing that BDAs can be used to identify codons precisely. This could serve as a basis for further mathematical analysis using coding theory, for example. The hypothesis that BDAs might reflect a molecular mechanism taking place in the decoding center of the ribosome is discussed. The scan demonstrated that binary dichotomic partitions are able to model different aspects of the genetic code very well. The search was performed with our tool Beady-A.", "file_path": "./data/paper/Gumbel/1-s2.0-S0303264714002044-main (2).pdf", "title": "On models of the genetic code generated by binary dichotomic algorithms", "abstract": "In this paper we introduce the concept of a BDA-generated model of the genetic code which is based on binary dichotomic algorithms (BDAs). A BDA-generated model is based on binary dichotomic algorithms (BDAs). Such a BDA partitions the set of 64 codons into two disjoint classes of size 32 each and provides a generalization of known partitions like the Rumer dichotomy. We investigate what partitions can be generated when a set of different BDAs is applied sequentially to the set of codons. The search revealed that these models are able to generate code tables with very different numbers of classes ranging from 2 to 64. We have analyzed whether there are models that map the codons to their amino acids. A perfect matching is not possible. However, we present models that describe the standard genetic code with only few errors. There are also models that map all 64 codons uniquely to 64 classes showing that BDAs can be used to identify codons precisely. This could serve as a basis for further mathematical analysis using coding theory, for example. The hypothesis that BDAs might reflect a molecular mechanism taking place in the decoding center of the ribosome is discussed. The scan demonstrated that binary dichotomic partitions are able to model different aspects of the genetic code very well. The search was performed with our tool Beady-A.", "keywords": [], "author": "Gumbel"}, {"content": "On comparing composition principles of long DNA sequences with those of random ones\nThe revelation of compositional principles of the organization of long DNA sequences is one of the crucial tasks in the study of biosystems. This paper is devoted to the analysis of compositional differences between real DNA sequences and Markov-like randomly generated similar sequences. We formulate, among other things, a generalization of Chargaff's second rule and verify it empirically on DNA sequences of five model organisms taken from Genbank. Moreover, we apply the same frequency analysis to simulated sequences. When comparing the afore mentioned-real and random-sequences, significant similarities, on the one hand, as well as essential differences between them, on the other hand, are revealed and described. The significance and possible origin of these differences, including those from the viewpoint of maximum informativeness of genetic texts, is discussed. Besides, the paper discusses the question of what is a \"long\" DNA sequence and quantifies the choice of length. More precisely, the standard deviations of relative frequencies of bases stabilize from the length of approximately 100 000 bases, whereas the deviations are about three times as large at the length of approximately 25 000 bases.", "file_path": "./data/paper/Gumbel/1-s2.0-S0303264719300590-main.pdf", "title": "On comparing composition principles of long DNA sequences with those of random ones", "abstract": "The revelation of compositional principles of the organization of long DNA sequences is one of the crucial tasks in the study of biosystems. This paper is devoted to the analysis of compositional differences between real DNA sequences and Markov-like randomly generated similar sequences. We formulate, among other things, a generalization of Chargaff's second rule and verify it empirically on DNA sequences of five model organisms taken from Genbank. Moreover, we apply the same frequency analysis to simulated sequences. When comparing the afore mentioned-real and random-sequences, significant similarities, on the one hand, as well as essential differences between them, on the other hand, are revealed and described. The significance and possible origin of these differences, including those from the viewpoint of maximum informativeness of genetic texts, is discussed. Besides, the paper discusses the question of what is a \"long\" DNA sequence and quantifies the choice of length. More precisely, the standard deviations of relative frequencies of bases stabilize from the length of approximately 100 000 bases, whereas the deviations are about three times as large at the length of approximately 25 000 bases.", "keywords": [], "author": "Gumbel"}, {"content": "On models of the genetic code generated by binary dichotomic algorithms\nIn this paper we introduce the concept of a BDA-generated model of the genetic code which is based on binary dichotomic algorithms (BDAs). A BDA-generated model is based on binary dichotomic algorithms (BDAs). Such a BDA partitions the set of 64 codons into two disjoint classes of size 32 each and provides a generalization of known partitions like the Rumer dichotomy. We investigate what partitions can be generated when a set of different BDAs is applied sequentially to the set of codons. The search revealed that these models are able to generate code tables with very different numbers of classes ranging from 2 to 64. We have analyzed whether there are models that map the codons to their amino acids. A perfect matching is not possible. However, we present models that describe the standard genetic code with only few errors. There are also models that map all 64 codons uniquely to 64 classes showing that BDAs can be used to identify codons precisely. This could serve as a basis for further mathematical analysis using coding theory, for example. The hypothesis that BDAs might reflect a molecular mechanism taking place in the decoding center of the ribosome is discussed. The scan demonstrated that binary dichotomic partitions are able to model different aspects of the genetic code very well. The search was performed with our tool Beady-A.", "file_path": "./data/paper/Gumbel/1-s2.0-S0303264714002044-main.pdf", "title": "On models of the genetic code generated by binary dichotomic algorithms", "abstract": "In this paper we introduce the concept of a BDA-generated model of the genetic code which is based on binary dichotomic algorithms (BDAs). A BDA-generated model is based on binary dichotomic algorithms (BDAs). Such a BDA partitions the set of 64 codons into two disjoint classes of size 32 each and provides a generalization of known partitions like the Rumer dichotomy. We investigate what partitions can be generated when a set of different BDAs is applied sequentially to the set of codons. The search revealed that these models are able to generate code tables with very different numbers of classes ranging from 2 to 64. We have analyzed whether there are models that map the codons to their amino acids. A perfect matching is not possible. However, we present models that describe the standard genetic code with only few errors. There are also models that map all 64 codons uniquely to 64 classes showing that BDAs can be used to identify codons precisely. This could serve as a basis for further mathematical analysis using coding theory, for example. The hypothesis that BDAs might reflect a molecular mechanism taking place in the decoding center of the ribosome is discussed. The scan demonstrated that binary dichotomic partitions are able to model different aspects of the genetic code very well. The search was performed with our tool Beady-A.", "keywords": [], "author": "Gumbel"}, {"content": "Robustness against point mutations of genetic code extensions under consideration of wobble-like effects\nMany theories of the evolution of the genetic code assume that the genetic code has always evolved in the direction of increasing the supply of amino acids to be encoded (Barbieri, 2019; Di Giulio, 2005; Wong, 1975). In order to reduce the risk of the formation of a non-functional protein due to point mutations, nature is said to have built in control mechanisms. Using graph theory the authors have investigated in Blazej et al. (2019) if this robustness is optimal in the sense that a different codon-amino acid assignment would not generate a code that is even more robust. At present, efforts to expand the genetic code are very relevant in biotechnological applications, for example, for the synthesis of new drugs (Anderson et al.,", "file_path": "./data/paper/Gumbel/1-s2.0-S0303264721001325-main (1).pdf", "title": "Robustness against point mutations of genetic code extensions under consideration of wobble-like effects", "abstract": "Many theories of the evolution of the genetic code assume that the genetic code has always evolved in the direction of increasing the supply of amino acids to be encoded (Barbieri, 2019; Di Giulio, 2005; Wong, 1975). In order to reduce the risk of the formation of a non-functional protein due to point mutations, nature is said to have built in control mechanisms. Using graph theory the authors have investigated in Blazej et al. (2019) if this robustness is optimal in the sense that a different codon-amino acid assignment would not generate a code that is even more robust. At present, efforts to expand the genetic code are very relevant in biotechnological applications, for example, for the synthesis of new drugs (Anderson et al.,", "keywords": [], "author": "Gumbel"}, {"content": "Computational Analysis of Genetic Code Variations Optimized for the Robustness against Point Mutations with Wobble-like Effects\nIt is believed that the codon-amino acid assignments of the standard genetic code (SGC) help to minimize the negative effects caused by point mutations. All possible point mutations of the genetic code can be represented as a weighted graph with weights that correspond to the probabilities of these mutations. The robustness of a code against point mutations can be described then by means of the so-called conductance measure. This paper quantifies the wobble effect, which was investigated previously by applying the weighted graph approach, and seeks optimal weights using an evolutionary optimization algorithm to maximize the code's robustness. One result of our study is that the robustness of the genetic code is least influenced by mutations in the third position-like with the wobble effect. Moreover, the results clearly demonstrate that point mutations in the first, and even more importantly, in the second base of a codon have a very large influence on the robustness of the genetic code. These results were compared to single nucleotide variants (SNV) in coding sequences which support our findings. Additionally, it was analyzed which structure of a genetic code evolves from random code tables when the robustness is maximized. Our calculations show that the resulting code tables are very close to the standard genetic code. In conclusion, the results illustrate that the robustness against point mutations seems to be an important factor in the evolution of the standard genetic code.", "file_path": "./data/paper/Gumbel/life-11-01338-v2.pdf", "title": "Computational Analysis of Genetic Code Variations Optimized for the Robustness against Point Mutations with Wobble-like Effects", "abstract": "It is believed that the codon-amino acid assignments of the standard genetic code (SGC) help to minimize the negative effects caused by point mutations. All possible point mutations of the genetic code can be represented as a weighted graph with weights that correspond to the probabilities of these mutations. The robustness of a code against point mutations can be described then by means of the so-called conductance measure. This paper quantifies the wobble effect, which was investigated previously by applying the weighted graph approach, and seeks optimal weights using an evolutionary optimization algorithm to maximize the code's robustness. One result of our study is that the robustness of the genetic code is least influenced by mutations in the third position-like with the wobble effect. Moreover, the results clearly demonstrate that point mutations in the first, and even more importantly, in the second base of a codon have a very large influence on the robustness of the genetic code. These results were compared to single nucleotide variants (SNV) in coding sequences which support our findings. Additionally, it was analyzed which structure of a genetic code evolves from random code tables when the robustness is maximized. Our calculations show that the resulting code tables are very close to the standard genetic code. In conclusion, the results illustrate that the robustness against point mutations seems to be an important factor in the evolution of the standard genetic code.", "keywords": ["genetic code", "point mutations", "wobble effect", "evolutionary algorithm"], "author": "Gumbel"}, {"content": "The Quality of Genetic Code Models in Terms of Their Robustness Against Point Mutations\nIn this paper, we investigate the quality of selected models of theoretical genetic codes in terms of their robustness against point mutations. To deal with this problem, we used a graph representation including all possible single nucleotide point mutations occurring in codons, which are building blocks of every protein-coding sequence. Following graph theory, the quality of a given code model is measured using the set conductance property which has a useful biological interpretation. Taking this approach, we found the most robust genetic code structures for a given number of coding blocks. In addition, we tested several properties of genetic code models generated by the binary dichotomic algorithms (BDA) and compared them with randomly generated genetic code models. The results indicate that BDA-generated models possess better properties in terms of the conductance measure than the majority of randomly generated genetic code models and, even more, that BDA-models can achieve the best possible conductance values. Therefore, BDA-generated models are very robust towards changes in encoded information generated by single nucleotide substitutions.", "file_path": "./data/paper/Gumbel/s11538-019-00603-2 (1).pdf", "title": "The Quality of Genetic Code Models in Terms of Their Robustness Against Point Mutations", "abstract": "In this paper, we investigate the quality of selected models of theoretical genetic codes in terms of their robustness against point mutations. To deal with this problem, we used a graph representation including all possible single nucleotide point mutations occurring in codons, which are building blocks of every protein-coding sequence. Following graph theory, the quality of a given code model is measured using the set conductance property which has a useful biological interpretation. Taking this approach, we found the most robust genetic code structures for a given number of coding blocks. In addition, we tested several properties of genetic code models generated by the binary dichotomic algorithms (BDA) and compared them with randomly generated genetic code models. The results indicate that BDA-generated models possess better properties in terms of the conductance measure than the majority of randomly generated genetic code models and, even more, that BDA-models can achieve the best possible conductance values. Therefore, BDA-generated models are very robust towards changes in encoded information generated by single nucleotide substitutions.", "keywords": ["Genetic code", "Dichotomy classes", "Point mutations"], "author": "Gumbel"}, {"content": "Generalizing the Arden Syntax to a Common Clinical Application Language\nThe Arden Syntax for Medical Logic Systems is a standard for encoding and sharing knowledge in the form of Medical Logic Modules (MLMs). Although the Arden Syntax has been designed to meet the requirements of data-driven clinical event monitoring, multiple studies suggest that its language constructs may be suitable for use outside the intended application area and even as a common clinical application language. Such a broader context, however, requires to reconsider some language features. The purpose of this paper is to outline the related modifications on the basis of a generalized Arden Syntax version. The implemented prototype provides multiple adjustments to the standard, such as an option to use programming language constructs without the frame-like MLM structure, a JSON compliant data type system, a means to use MLMs as user-defined functions, and native support of restful web services with integrated data mapping. This study does not aim to promote an actually new language, but a more generic version of the proven Arden Syntax standard. Such an easy-to-understand domain-specific language for common clinical applications might cover multiple additional medical subdomains and serve as a lingua franca for arbitrary clinical algorithms, therefore avoiding a patchwork of multiple all-purpose languages between, and even within, institutions.", "file_path": "./data/paper/Kraus/SHTI247-0675.pdf", "title": "Generalizing the Arden Syntax to a Common Clinical Application Language", "abstract": "The Arden Syntax for Medical Logic Systems is a standard for encoding and sharing knowledge in the form of Medical Logic Modules (MLMs). Although the Arden Syntax has been designed to meet the requirements of data-driven clinical event monitoring, multiple studies suggest that its language constructs may be suitable for use outside the intended application area and even as a common clinical application language. Such a broader context, however, requires to reconsider some language features. The purpose of this paper is to outline the related modifications on the basis of a generalized Arden Syntax version. The implemented prototype provides multiple adjustments to the standard, such as an option to use programming language constructs without the frame-like MLM structure, a JSON compliant data type system, a means to use MLMs as user-defined functions, and native support of restful web services with integrated data mapping. This study does not aim to promote an actually new language, but a more generic version of the proven Arden Syntax standard. Such an easy-to-understand domain-specific language for common clinical applications might cover multiple additional medical subdomains and serve as a lingua franca for arbitrary clinical algorithms, therefore avoiding a patchwork of multiple all-purpose languages between, and even within, institutions.", "keywords": ["Domain-specific language", "Arden Syntax", "Medical Logic Modules"], "author": "Kraus"}, {"content": "Pan-European Data Harmonization for Biobanks in ADOPT BBMRI-ERIC\nelements \u25ba health information interoperability \u25ba biological specimen banks", "file_path": "./data/paper/Kraus/s-0039-1695793.pdf", "title": "Pan-European Data Harmonization for Biobanks in ADOPT BBMRI-ERIC", "abstract": "elements \u25ba health information interoperability \u25ba biological specimen banks", "keywords": [], "author": "Kraus"}, {"content": "Investigating the Capabilities of FHIR Search for Clinical Trial Phenotyping\nClinical trials are the foundation of evidence-based medicine and their computerized support has been a recurring theme in medical informatics. One challenging aspect is the representation of eligibility criteria in a machine-readable format to automate the identification of suitable participants. In this study, we investigate the capabilities for expressing trial eligibility criteria via the search functionality specified in HL7 FHIR, an emerging standard for exchanging healthcare information electronically which also defines a set of operations for searching for health record data. Using a randomly sampled subset of 303 eligibility criteria from ClinicalTrials.gov yielded a 34 % success rate in representing them using the FHIR search semantics. While limitations are present, the FHIR search semantics are a viable tool for supporting preliminary trial eligibility assessment.", "file_path": "./data/paper/Kraus/SHTI253-0003.pdf", "title": "Investigating the Capabilities of FHIR Search for Clinical Trial Phenotyping", "abstract": "Clinical trials are the foundation of evidence-based medicine and their computerized support has been a recurring theme in medical informatics. One challenging aspect is the representation of eligibility criteria in a machine-readable format to automate the identification of suitable participants. In this study, we investigate the capabilities for expressing trial eligibility criteria via the search functionality specified in HL7 FHIR, an emerging standard for exchanging healthcare information electronically which also defines a set of operations for searching for health record data. Using a randomly sampled subset of 303 eligibility criteria from ClinicalTrials.gov yielded a 34 % success rate in representing them using the FHIR search semantics. While limitations are present, the FHIR search semantics are a viable tool for supporting preliminary trial eligibility assessment.", "keywords": ["FHIR", "phenotyping", "clinical trials"], "author": "Kraus"}, {"content": "A method for the graphical modeling of relative temporal constraints\nSearching for patient cohorts in electronic patient data often requires the definition of temporal constraints between the selection criteria. However, beyond a certain degree of temporal complexity, the non-graphical, form-based approaches implemented in current translational research platforms may be limited when modeling such constraints. In our opinion, there is a need for an easily accessible and implementable, fully graphical method for creating temporal queries. We aim to respond to this challenge with a new graphical notation. Based on Allen's time interval algebra, it allows for modeling temporal queries by arranging simple horizontal bars depicting symbolic time intervals. To make our approach applicable to complex temporal patterns, we apply two extensions: with duration intervals, we enable the inference about relative temporal distances between patient events, and with time interval modifiers, we support counting and excluding patient events, as well as constraining numeric values. We describe how to generate database queries from this notation. We provide a prototypical implementation, consisting of a temporal query modeling frontend and an experimental backend that connects to an i2b2 system. We evaluate our modeling approach on the MIMIC-III database to demonstrate that it can be used for modeling typical temporal phenotyping queries.", "file_path": "./data/paper/Kraus/1-s2.0-S1532046419302333-main.pdf", "title": "A method for the graphical modeling of relative temporal constraints", "abstract": "Searching for patient cohorts in electronic patient data often requires the definition of temporal constraints between the selection criteria. However, beyond a certain degree of temporal complexity, the non-graphical, form-based approaches implemented in current translational research platforms may be limited when modeling such constraints. In our opinion, there is a need for an easily accessible and implementable, fully graphical method for creating temporal queries. We aim to respond to this challenge with a new graphical notation. Based on Allen's time interval algebra, it allows for modeling temporal queries by arranging simple horizontal bars depicting symbolic time intervals. To make our approach applicable to complex temporal patterns, we apply two extensions: with duration intervals, we enable the inference about relative temporal distances between patient events, and with time interval modifiers, we support counting and excluding patient events, as well as constraining numeric values. We describe how to generate database queries from this notation. We provide a prototypical implementation, consisting of a temporal query modeling frontend and an experimental backend that connects to an i2b2 system. We evaluate our modeling approach on the MIMIC-III database to demonstrate that it can be used for modeling typical temporal phenotyping queries.", "keywords": [], "author": "Kraus"}, {"content": "Patient Cohort Identification on Time Series Data Using the OMOP Common Data Model\nBackground The identification of patient cohorts for recruiting patients into clinical trials requires an evaluation of study-specific inclusion and exclusion criteria. These criteria are specified depending on corresponding clinical facts. Some of these facts may not be present in the clinical source systems and need to be calculated either in advance or at cohort query runtime (so-called feasibility query). Objectives We use the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) as the repository for our clinical data. However, Atlas, the graphical user interface of OMOP, does not offer the functionality to perform calculations on facts data. Therefore, we were in search for a different approach. The objective of this study is to investigate whether the Arden Syntax can be used for feasibility queries on the OMOP CDM to enable on-the-fly calculations at query runtime, to eliminate the need to precalculate data elements that are involved with researchers' criteria specification. Methods We implemented a service that reads the facts from the OMOP repository and provides it in a form which an Arden Syntax Medical Logic Module (MLM) can process. Then, we implemented an MLM that applies the eligibility criteria to every patient data set and outputs the list of eligible cases (i.e., performs the feasibility query). Results The study resulted in an MLM-based feasibility query that identifies cases of overventilation as an example of how an on-the-fly calculation can be realized. The algorithm is split into two MLMs to provide the reusability of the approach. Conclusion We found that MLMs are a suitable technology for feasibility queries on the OMOP CDM. Our method of performing on-the-fly calculations can be employed with any OMOP instance and without touching existing infrastructure like the Extract, Transform and Load pipeline. Therefore, we think that it is a well-suited method to perform on-the-fly calculations on OMOP.", "file_path": "./data/paper/Kraus/s-0040-1721481.pdf", "title": "Patient Cohort Identification on Time Series Data Using the OMOP Common Data Model", "abstract": "Background The identification of patient cohorts for recruiting patients into clinical trials requires an evaluation of study-specific inclusion and exclusion criteria. These criteria are specified depending on corresponding clinical facts. Some of these facts may not be present in the clinical source systems and need to be calculated either in advance or at cohort query runtime (so-called feasibility query). Objectives We use the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) as the repository for our clinical data. However, Atlas, the graphical user interface of OMOP, does not offer the functionality to perform calculations on facts data. Therefore, we were in search for a different approach. The objective of this study is to investigate whether the Arden Syntax can be used for feasibility queries on the OMOP CDM to enable on-the-fly calculations at query runtime, to eliminate the need to precalculate data elements that are involved with researchers' criteria specification. Methods We implemented a service that reads the facts from the OMOP repository and provides it in a form which an Arden Syntax Medical Logic Module (MLM) can process. Then, we implemented an MLM that applies the eligibility criteria to every patient data set and outputs the list of eligible cases (i.e., performs the feasibility query). Results The study resulted in an MLM-based feasibility query that identifies cases of overventilation as an example of how an on-the-fly calculation can be realized. The algorithm is split into two MLMs to provide the reusability of the approach. Conclusion We found that MLMs are a suitable technology for feasibility queries on the OMOP CDM. Our method of performing on-the-fly calculations can be employed with any OMOP instance and without touching existing infrastructure like the Extract, Transform and Load pipeline. Therefore, we think that it is a well-suited method to perform on-the-fly calculations on OMOP.", "keywords": [], "author": "Kraus"}, {"content": "Integrating Arden-Syntax-based clinical decision support with extended presentation formats into a commercial patient data management system\nThe purpose of this study was to introduce clinical decision support (CDS) that exceeds conventional alerting at tertiary care intensive care units. We investigated physicians' functional CDS requirements in periodic interviews, and analyzed technical interfaces of the existing commercial patient data management system (PDMS). Building on these assessments, we adapted a platform that processes Arden Syntax medical logic modules (MLMs). Clinicians demanded data-driven, user-driven and timedriven execution of MLMs, as well as multiple presentation formats such as tables and graphics. The used PDMS represented a black box insofar as it did not provide standardized interfaces for event notification and external access to patient data; enabling CDS thus required periodically exporting datasets for making them accessible to the invoked Arden engine. A client-server-architecture with a simple browser-based viewer allows users to activate MLM execution and to access CDS results, while an MLM library generates hypertext for diverse presentation targets. The workaround that involves a periodic data replication entails a trade-off between the necessary computational resources and a delay of generated alert messages. Web technologies proved serviceable for reconciling Arden-based CDS functions with alternative presentation formats, including tables, text formatting, graphical outputs, as well as list-based overviews of data from several patients that the native PDMS did not support.", "file_path": "./data/paper/Kraus/s10877-013-9430-0.pdf", "title": "Integrating Arden-Syntax-based clinical decision support with extended presentation formats into a commercial patient data management system", "abstract": "The purpose of this study was to introduce clinical decision support (CDS) that exceeds conventional alerting at tertiary care intensive care units. We investigated physicians' functional CDS requirements in periodic interviews, and analyzed technical interfaces of the existing commercial patient data management system (PDMS). Building on these assessments, we adapted a platform that processes Arden Syntax medical logic modules (MLMs). Clinicians demanded data-driven, user-driven and timedriven execution of MLMs, as well as multiple presentation formats such as tables and graphics. The used PDMS represented a black box insofar as it did not provide standardized interfaces for event notification and external access to patient data; enabling CDS thus required periodically exporting datasets for making them accessible to the invoked Arden engine. A client-server-architecture with a simple browser-based viewer allows users to activate MLM execution and to access CDS results, while an MLM library generates hypertext for diverse presentation targets. The workaround that involves a periodic data replication entails a trade-off between the necessary computational resources and a delay of generated alert messages. Web technologies proved serviceable for reconciling Arden-based CDS functions with alternative presentation formats, including tables, text formatting, graphical outputs, as well as list-based overviews of data from several patients that the native PDMS did not support.", "keywords": ["Arden Syntax", "Clinical decision support", "Clinical event monitoring", "Patient data management system", "Intensive care unit"], "author": "Kraus"}, {"content": "Accessing complex patient data from Arden Syntax Medical Logic Modules\nObjective: Arden Syntax is a standard for representing and sharing medical knowledge in form of independent modules and looks back on a history of 25 years. Its traditional field of application is the monitoring of clinical events such as generating an alert in case of occurrence of a critical laboratory result. Arden Syntax Medical Logic Modules must be able to retrieve patient data from the electronic medical record in order to enable automated decision making. For patient data with a simple structure, for instance a list of laboratory results, or, in a broader view, any patient data with a list or table structure, this mapping process is straightforward. Nevertheless, if patient data are of a complex nested structure the mapping process may become tedious. Two clinical requirements-to process complex microbiology data and to decrease the time between a critical laboratory event and its alerting by monitoring Health Level 7 (HL7) communication-have triggered the investigation of approaches for providing complex patient data from electronic medical records inside Arden Syntax Medical Logic Modules. Methods and materials: The data mapping capabilities of current versions of the Arden Syntax standard as well as interfaces and data mapping capabilities of three different Arden Syntax environments have been analyzed. We found and implemented three different approaches to map a test sample of complex microbiology data for 22 patients and measured their execution times and memory usage. Based on one of these approaches, we mapped entire HL7 messages onto congruent Arden Syntax objects. Results: While current versions of Arden Syntax support the mapping of list and table structures, complex data structures are so far unsupported. We identified three different approaches to map complex data from electronic patient records onto Arden Syntax variables; each of these approaches successfully mapped a test sample of complex microbiology data. The first approach was implemented in Arden Syntax itself, the second one inside the interface component of one of the investigated Arden Syntax environments. The third one was based on deserialization of Extended Markup Language (XML) data. Mean execution times of the approaches to map the test sample were 497 ms, 382 ms, and 84 ms. Peak memory usage amounted to 3 MB, 3 MB, and 6 MB. Conclusion: The most promising approach by far was to map arbitrary XML structures onto congruent complex data types of Arden Syntax through deserialization. This approach is generic insofar as a data mapper based on this approach can transform any patient data provided in appropriate XML format. Therefore it could help overcome a major obstacle for integrating clinical decision support functions into clinical information systems. Theoretically, the deserialization approach would even allow mapping entire patient records onto Arden Syntax objects in one single step. We recommend extending the Arden Syntax specification with an appropriate XML data format.", "file_path": "./data/paper/Kraus/1-s2.0-S0933365715001207-main.pdf", "title": "Accessing complex patient data from Arden Syntax Medical Logic Modules", "abstract": "Objective: Arden Syntax is a standard for representing and sharing medical knowledge in form of independent modules and looks back on a history of 25 years. Its traditional field of application is the monitoring of clinical events such as generating an alert in case of occurrence of a critical laboratory result. Arden Syntax Medical Logic Modules must be able to retrieve patient data from the electronic medical record in order to enable automated decision making. For patient data with a simple structure, for instance a list of laboratory results, or, in a broader view, any patient data with a list or table structure, this mapping process is straightforward. Nevertheless, if patient data are of a complex nested structure the mapping process may become tedious. Two clinical requirements-to process complex microbiology data and to decrease the time between a critical laboratory event and its alerting by monitoring Health Level 7 (HL7) communication-have triggered the investigation of approaches for providing complex patient data from electronic medical records inside Arden Syntax Medical Logic Modules. Methods and materials: The data mapping capabilities of current versions of the Arden Syntax standard as well as interfaces and data mapping capabilities of three different Arden Syntax environments have been analyzed. We found and implemented three different approaches to map a test sample of complex microbiology data for 22 patients and measured their execution times and memory usage. Based on one of these approaches, we mapped entire HL7 messages onto congruent Arden Syntax objects. Results: While current versions of Arden Syntax support the mapping of list and table structures, complex data structures are so far unsupported. We identified three different approaches to map complex data from electronic patient records onto Arden Syntax variables; each of these approaches successfully mapped a test sample of complex microbiology data. The first approach was implemented in Arden Syntax itself, the second one inside the interface component of one of the investigated Arden Syntax environments. The third one was based on deserialization of Extended Markup Language (XML) data. Mean execution times of the approaches to map the test sample were 497 ms, 382 ms, and 84 ms. Peak memory usage amounted to 3 MB, 3 MB, and 6 MB. Conclusion: The most promising approach by far was to map arbitrary XML structures onto congruent complex data types of Arden Syntax through deserialization. This approach is generic insofar as a data mapper based on this approach can transform any patient data provided in appropriate XML format. Therefore it could help overcome a major obstacle for integrating clinical decision support functions into clinical information systems. Theoretically, the deserialization approach would even allow mapping entire patient records onto Arden Syntax objects in one single step. We recommend extending the Arden Syntax specification with an appropriate XML data format.", "keywords": [], "author": "Kraus"}, {"content": "A REST Service for the Visualization of Clinical Time Series Data in the Context of Clinical Decision Support\nBackground: University Hospital Erlangen provides clinical decision support (CDS) functions in the intensive care setting, that are based on the Arden Syntax standard. These CDS functions generate extensive output, including patient data charts. In the course of the migration of our CDS platform we revised the charting tool because although the tool was generally perceived as useful, the clinical users reported several shortcomings. Objective: During the migration of our CDS platform, we aimed at resolving the reported shortcomings and at developing a reusable and parameterizable charting tool, driven by best practices and requirements of local clinicians. Methods: We conducted a requirements analysis with local clinicians and searched the literature for well-established guidelines for clinical charts. Using a charting library, we then implemented the tool based on the found criteria and provided it with a REST interface. Results: The criteria catalog included 18 requirements, all of which were successfully implemented. The new charting tool fully replaced the previous implementation in clinical routine. It also provides a web interface that enables clinicians to configure charts without programming skills. Conclusion: The new charting tool combines local preferences with best practices for visualization of clinical time series data. With its REST interface and reusable design it can be easily integrated in existing CDS platforms.", "file_path": "./data/paper/Kraus/SHTI258-0026.pdf", "title": "A REST Service for the Visualization of Clinical Time Series Data in the Context of Clinical Decision Support", "abstract": "Background: University Hospital Erlangen provides clinical decision support (CDS) functions in the intensive care setting, that are based on the Arden Syntax standard. These CDS functions generate extensive output, including patient data charts. In the course of the migration of our CDS platform we revised the charting tool because although the tool was generally perceived as useful, the clinical users reported several shortcomings. Objective: During the migration of our CDS platform, we aimed at resolving the reported shortcomings and at developing a reusable and parameterizable charting tool, driven by best practices and requirements of local clinicians. Methods: We conducted a requirements analysis with local clinicians and searched the literature for well-established guidelines for clinical charts. Using a charting library, we then implemented the tool based on the found criteria and provided it with a REST interface. Results: The criteria catalog included 18 requirements, all of which were successfully implemented. The new charting tool fully replaced the previous implementation in clinical routine. It also provides a web interface that enables clinicians to configure charts without programming skills. Conclusion: The new charting tool combines local preferences with best practices for visualization of clinical time series data. With its REST interface and reusable design it can be easily integrated in existing CDS platforms.", "keywords": ["clinical decision support", "patient data charts", "visualization"], "author": "Kraus"}, {"content": "Mapping the Entire Record-An Alternative Approach to Data Access from Medical Logic Modules\nObjectives This study aimed to describe an alternative approach for accessing electronic medical records (EMRs) from clinical decision support (CDS) functions based on Arden Syntax Medical Logic Modules, which can be paraphrased as \"map the entire record.\" Methods Based on an experimental Arden Syntax processor, we implemented a method to transform patient data from a commercial patient data management system (PDMS) to tree-structured documents termed CDS EMRs. They are encoded in a specific XML format that can be directly transformed to Arden Syntax data types by a mapper natively integrated into the processor. The internal structure of a CDS EMR reflects the tabbed view of an EMR in the graphical user interface of the PDMS. Results The study resulted in an architecture that provides CDS EMRs in the form of a network service. The approach enables uniform data access from all Medical Logic Modules and requires no mapping parameters except a case number. Measurements within a CDS EMR can be addressed with straightforward path expressions. The approach is in routine use at a German university hospital for more than 2 years. Conclusion This practical approach facilitates the use of CDS functions in the clinical routine at our local hospital. It is transferrable to standard-compliant Arden Syntax processors with moderate effort. Its comprehensibility can also facilitate teaching and development. Moreover, it may lower the entry barrier for the application of the Arden Syntax standard and could therefore promote its dissemination.", "file_path": "./data/paper/Kraus/s-0040-1709708.pdf", "title": "Mapping the Entire Record-An Alternative Approach to Data Access from Medical Logic Modules", "abstract": "Objectives This study aimed to describe an alternative approach for accessing electronic medical records (EMRs) from clinical decision support (CDS) functions based on Arden Syntax Medical Logic Modules, which can be paraphrased as \"map the entire record.\" Methods Based on an experimental Arden Syntax processor, we implemented a method to transform patient data from a commercial patient data management system (PDMS) to tree-structured documents termed CDS EMRs. They are encoded in a specific XML format that can be directly transformed to Arden Syntax data types by a mapper natively integrated into the processor. The internal structure of a CDS EMR reflects the tabbed view of an EMR in the graphical user interface of the PDMS. Results The study resulted in an architecture that provides CDS EMRs in the form of a network service. The approach enables uniform data access from all Medical Logic Modules and requires no mapping parameters except a case number. Measurements within a CDS EMR can be addressed with straightforward path expressions. The approach is in routine use at a German university hospital for more than 2 years. Conclusion This practical approach facilitates the use of CDS functions in the clinical routine at our local hospital. It is transferrable to standard-compliant Arden Syntax processors with moderate effort. Its comprehensibility can also facilitate teaching and development. Moreover, it may lower the entry barrier for the application of the Arden Syntax standard and could therefore promote its dissemination.", "keywords": [], "author": "Kraus"}, {"content": "Secondary use of routinely collected patient data in a clinical trial: An evaluation of the effects on patient recruitment and data acquisition\na l i n f o r m a t i c s 8 2 (2 0 1 3) 185-192 Conclusions: Reuse of routine data can help to improve the quality of patient recruitment and may reduce the time needed for data acquisition. These benefits can exceed the efforts required for development and implementation of the corresponding electronic support systems.", "file_path": "./data/paper/Kraus/Secondary_use_of_routinely_collected_pat.pdf", "title": "Secondary use of routinely collected patient data in a clinical trial: An evaluation of the effects on patient recruitment and data acquisition", "abstract": "a l i n f o r m a t i c s 8 2 (2 0 1 3) 185-192 Conclusions: Reuse of routine data can help to improve the quality of patient recruitment and may reduce the time needed for data acquisition. These benefits can exceed the efforts required for development and implementation of the corresponding electronic support systems.", "keywords": [], "author": "Kraus"}, {"content": "Using Arden Syntax for the creation of a multi-patient surveillance dashboard\nObjective: Most practically deployed Arden-Syntax-based clinical decision support (CDS) modules process data from individual patients. The specification of Arden Syntax, however, would in principle also support multi-patient CDS. The patient data management system (PDMS) at our local intensive care units does not natively support patient overviews from customizable CDS routines, but local physicians indicated a demand for multi-patient tabular overviews of important clinical parameters such as key laboratory measurements. As our PDMS installation provides Arden Syntax support, we set out to explore the capability of Arden Syntax for multi-patient CDS by implementing a prototypical dashboard for visualizing laboratory findings from patient sets. Methods and material: Our implementation leveraged the object data type, supported by later versions of Arden, which turned out to be serviceable for representing complex input data from several patients. For our prototype, we designed a modularized architecture that separates the definition of technical operations, in particular the control of the patient context, from the actual clinical knowledge. Individual Medical Logic Modules (MLMs) for processing single patient attributes could then be developed according to well-tried Arden Syntax conventions. Results: We successfully implemented a working dashboard prototype entirely in Arden Syntax. The architecture consists of a controller MLM to handle the patient context, a presenter MLM to generate a dashboard view, and a set of traditional MLMs containing the clinical decision logic. Our prototype could be integrated into the graphical user interface of the local PDMS. We observed that with realistic input data the average execution time of about 200 ms for generating dashboard views attained applicable performance. Conclusion: Our study demonstrated the general feasibility of creating multi-patient CDS routines in Arden Syntax. We believe that our prototypical dashboard also suggests that such implementations can be relatively easy, and may simultaneously hold promise for sharing dashboards between institutions and reusing elementary components for additional dashboards.", "file_path": "./data/paper/Kraus/1-s2.0-S0933365715001268-main.pdf", "title": "Using Arden Syntax for the creation of a multi-patient surveillance dashboard", "abstract": "Objective: Most practically deployed Arden-Syntax-based clinical decision support (CDS) modules process data from individual patients. The specification of Arden Syntax, however, would in principle also support multi-patient CDS. The patient data management system (PDMS) at our local intensive care units does not natively support patient overviews from customizable CDS routines, but local physicians indicated a demand for multi-patient tabular overviews of important clinical parameters such as key laboratory measurements. As our PDMS installation provides Arden Syntax support, we set out to explore the capability of Arden Syntax for multi-patient CDS by implementing a prototypical dashboard for visualizing laboratory findings from patient sets. Methods and material: Our implementation leveraged the object data type, supported by later versions of Arden, which turned out to be serviceable for representing complex input data from several patients. For our prototype, we designed a modularized architecture that separates the definition of technical operations, in particular the control of the patient context, from the actual clinical knowledge. Individual Medical Logic Modules (MLMs) for processing single patient attributes could then be developed according to well-tried Arden Syntax conventions. Results: We successfully implemented a working dashboard prototype entirely in Arden Syntax. The architecture consists of a controller MLM to handle the patient context, a presenter MLM to generate a dashboard view, and a set of traditional MLMs containing the clinical decision logic. Our prototype could be integrated into the graphical user interface of the local PDMS. We observed that with realistic input data the average execution time of about 200 ms for generating dashboard views attained applicable performance. Conclusion: Our study demonstrated the general feasibility of creating multi-patient CDS routines in Arden Syntax. We believe that our prototypical dashboard also suggests that such implementations can be relatively easy, and may simultaneously hold promise for sharing dashboards between institutions and reusing elementary components for additional dashboards.", "keywords": [], "author": "Kraus"}, {"content": "On models of the genetic code generated by binary dichotomic algorithms\nIn this paper we introduce the concept of a BDA-generated model of the genetic code which is based on binary dichotomic algorithms (BDAs). A BDA-generated model is based on binary dichotomic algorithms (BDAs). Such a BDA partitions the set of 64 codons into two disjoint classes of size 32 each and provides a generalization of known partitions like the Rumer dichotomy. We investigate what partitions can be generated when a set of different BDAs is applied sequentially to the set of codons. The search revealed that these models are able to generate code tables with very different numbers of classes ranging from 2 to 64. We have analyzed whether there are models that map the codons to their amino acids. A perfect matching is not possible. However, we present models that describe the standard genetic code with only few errors. There are also models that map all 64 codons uniquely to 64 classes showing that BDAs can be used to identify codons precisely. This could serve as a basis for further mathematical analysis using coding theory, for example. The hypothesis that BDAs might reflect a molecular mechanism taking place in the decoding center of the ribosome is discussed. The scan demonstrated that binary dichotomic partitions are able to model different aspects of the genetic code very well. The search was performed with our tool Beady-A.", "file_path": "./data/paper/Fimmel/1-s2.0-S0303264714002044-main (1).pdf", "title": "On models of the genetic code generated by binary dichotomic algorithms", "abstract": "In this paper we introduce the concept of a BDA-generated model of the genetic code which is based on binary dichotomic algorithms (BDAs). A BDA-generated model is based on binary dichotomic algorithms (BDAs). Such a BDA partitions the set of 64 codons into two disjoint classes of size 32 each and provides a generalization of known partitions like the Rumer dichotomy. We investigate what partitions can be generated when a set of different BDAs is applied sequentially to the set of codons. The search revealed that these models are able to generate code tables with very different numbers of classes ranging from 2 to 64. We have analyzed whether there are models that map the codons to their amino acids. A perfect matching is not possible. However, we present models that describe the standard genetic code with only few errors. There are also models that map all 64 codons uniquely to 64 classes showing that BDAs can be used to identify codons precisely. This could serve as a basis for further mathematical analysis using coding theory, for example. The hypothesis that BDAs might reflect a molecular mechanism taking place in the decoding center of the ribosome is discussed. The scan demonstrated that binary dichotomic partitions are able to model different aspects of the genetic code very well. The search was performed with our tool Beady-A.", "keywords": [], "author": "Fimmel"}, {"content": "Circular Tessera Codes in the Evolution of the Genetic Code\nThe origin of the modern genetic code and the mechanisms that have contributed to its present form raise many questions. The main goal of this work is to test two hypotheses concerning the development of the genetic code for their compatibility and complementarity and see if they could benefit from each other. On the one hand, Gonzalez, Giannerini and Rosa developed a theory, based on four-based codons, which they called tesserae. This theory can explain the degeneracy of the modern vertebrate mitochondrial code. On the other hand, in the 1990s, so-called circular codes were discovered in nature, which seem to ensure the maintenance of a correct readingframe during the translation process. It turns out that the two concepts not only do not contradict each other, but on the contrary complement and enrichen each other.", "file_path": "./data/paper/Fimmel/Circular_Tessera_Codes_in_the_Evolution_of_the_Gen.pdf", "title": "Circular Tessera Codes in the Evolution of the Genetic Code", "abstract": "The origin of the modern genetic code and the mechanisms that have contributed to its present form raise many questions. The main goal of this work is to test two hypotheses concerning the development of the genetic code for their compatibility and complementarity and see if they could benefit from each other. On the one hand, Gonzalez, Giannerini and Rosa developed a theory, based on four-based codons, which they called tesserae. This theory can explain the degeneracy of the modern vertebrate mitochondrial code. On the other hand, in the 1990s, so-called circular codes were discovered in nature, which seem to ensure the maintenance of a correct readingframe during the translation process. It turns out that the two concepts not only do not contradict each other, but on the contrary complement and enrichen each other.", "keywords": ["Genetic code", "Degeneracy", "Circular code", "Tessera"], "author": "Fimmel"}, {"content": "Equivalence classes of circular codes induced by permutation groups\nIn the 1950s, Crick proposed the concept of so-called comma-free codes as an answer to the frame-shift problem that biologists have encountered when studying the process of translating a sequence of nucleotide bases into a protein. A little later it turned out that this proposal unfortunately does not correspond to biological reality. However, in the mid-90s, a weaker version of comma-free codes, so-called circular codes, was discovered in nature in J Theor Biol 182:45-58, 1996. Circular codes allow to retrieve the reading frame during the translational process in the ribosome and surprisingly the circular code discovered in nature is even circular in all three possible reading-frames (C 3-property). Moreover, it is maximal in the sense that it contains 20 codons and is self-complementary which means that it consists of pairs of codons and corresponding anticodons. In further investigations, it was found that there are exactly 216 codes that have the same strong properties as the originally found code from J Theor Biol 182:45-58. Using an algebraic approach, it was shown in J Math Biol, 2004 that the class of 216 maximal self-complementary C 3-codes can be partitioned into 27 equally sized equivalence classes by the action of a transformation group L \u2286 S 4 which is isomorphic to the dihedral group. Here, we extend the above findings to circular codes over a finite alphabet of even cardinality |\u03a3| = 2n for n \u2208 \u2115. We describe the corresponding group L n using matrices and we investigate what classes of circular codes are split into equally sized equivalence classes under the natural equivalence relation induced by L n. Surprisingly, this is not always the case. All results and constructions are illustrated by examples.", "file_path": "./data/paper/Fimmel/Equivalence_classes_of_circular_codes_induced_by_p.pdf", "title": "Equivalence classes of circular codes induced by permutation groups", "abstract": "In the 1950s, Crick proposed the concept of so-called comma-free codes as an answer to the frame-shift problem that biologists have encountered when studying the process of translating a sequence of nucleotide bases into a protein. A little later it turned out that this proposal unfortunately does not correspond to biological reality. However, in the mid-90s, a weaker version of comma-free codes, so-called circular codes, was discovered in nature in J Theor Biol 182:45-58, 1996. Circular codes allow to retrieve the reading frame during the translational process in the ribosome and surprisingly the circular code discovered in nature is even circular in all three possible reading-frames (C 3-property). Moreover, it is maximal in the sense that it contains 20 codons and is self-complementary which means that it consists of pairs of codons and corresponding anticodons. In further investigations, it was found that there are exactly 216 codes that have the same strong properties as the originally found code from J Theor Biol 182:45-58. Using an algebraic approach, it was shown in J Math Biol, 2004 that the class of 216 maximal self-complementary C 3-codes can be partitioned into 27 equally sized equivalence classes by the action of a transformation group L \u2286 S 4 which is isomorphic to the dihedral group. Here, we extend the above findings to circular codes over a finite alphabet of even cardinality |\u03a3| = 2n for n \u2208 \u2115. We describe the corresponding group L n using matrices and we investigate what classes of circular codes are split into equally sized equivalence classes under the natural equivalence relation induced by L n. Surprisingly, this is not always the case. All results and constructions are illustrated by examples.", "keywords": ["Circular codes", "Comma-free codes", "Symmetric group", "Frame retrieval", "Translation", "Signal processing", "Symmetric group"], "author": "Fimmel"}, {"content": "Computational Analysis of Genetic Code Variations Optimized for the Robustness against Point Mutations with Wobble-like Effects\nIt is believed that the codon-amino acid assignments of the standard genetic code (SGC) help to minimize the negative effects caused by point mutations. All possible point mutations of the genetic code can be represented as a weighted graph with weights that correspond to the probabilities of these mutations. The robustness of a code against point mutations can be described then by means of the so-called conductance measure. This paper quantifies the wobble effect, which was investigated previously by applying the weighted graph approach, and seeks optimal weights using an evolutionary optimization algorithm to maximize the code's robustness. One result of our study is that the robustness of the genetic code is least influenced by mutations in the third position-like with the wobble effect. Moreover, the results clearly demonstrate that point mutations in the first, and even more importantly, in the second base of a codon have a very large influence on the robustness of the genetic code. These results were compared to single nucleotide variants (SNV) in coding sequences which support our findings. Additionally, it was analyzed which structure of a genetic code evolves from random code tables when the robustness is maximized. Our calculations show that the resulting code tables are very close to the standard genetic code. In conclusion, the results illustrate that the robustness against point mutations seems to be an important factor in the evolution of the standard genetic code.", "file_path": "./data/paper/Fimmel/Computational_Analysis_of_Genetic_Code_Variations_.pdf", "title": "Computational Analysis of Genetic Code Variations Optimized for the Robustness against Point Mutations with Wobble-like Effects", "abstract": "It is believed that the codon-amino acid assignments of the standard genetic code (SGC) help to minimize the negative effects caused by point mutations. All possible point mutations of the genetic code can be represented as a weighted graph with weights that correspond to the probabilities of these mutations. The robustness of a code against point mutations can be described then by means of the so-called conductance measure. This paper quantifies the wobble effect, which was investigated previously by applying the weighted graph approach, and seeks optimal weights using an evolutionary optimization algorithm to maximize the code's robustness. One result of our study is that the robustness of the genetic code is least influenced by mutations in the third position-like with the wobble effect. Moreover, the results clearly demonstrate that point mutations in the first, and even more importantly, in the second base of a codon have a very large influence on the robustness of the genetic code. These results were compared to single nucleotide variants (SNV) in coding sequences which support our findings. Additionally, it was analyzed which structure of a genetic code evolves from random code tables when the robustness is maximized. Our calculations show that the resulting code tables are very close to the standard genetic code. In conclusion, the results illustrate that the robustness against point mutations seems to be an important factor in the evolution of the standard genetic code.", "keywords": ["genetic code", "point mutations", "wobble effect", "evolutionary algorithm"], "author": "Fimmel"}, {"content": "Robustness against point mutations of genetic code extensions under consideration of wobble-like effects\nMany theories of the evolution of the genetic code assume that the genetic code has always evolved in the direction of increasing the supply of amino acids to be encoded (Barbieri, 2019; Di Giulio, 2005; Wong, 1975). In order to reduce the risk of the formation of a non-functional protein due to point mutations, nature is said to have built in control mechanisms. Using graph theory the authors have investigated in Blazej et al. (2019) if this robustness is optimal in the sense that a different codon-amino acid assignment would not generate a code that is even more robust. At present, efforts to expand the genetic code are very relevant in biotechnological applications, for example, for the synthesis of new drugs (Anderson et al.,", "file_path": "./data/paper/Fimmel/1-s2.0-S0303264721001325-main.pdf", "title": "Robustness against point mutations of genetic code extensions under consideration of wobble-like effects", "abstract": "Many theories of the evolution of the genetic code assume that the genetic code has always evolved in the direction of increasing the supply of amino acids to be encoded (Barbieri, 2019; Di Giulio, 2005; Wong, 1975). In order to reduce the risk of the formation of a non-functional protein due to point mutations, nature is said to have built in control mechanisms. Using graph theory the authors have investigated in Blazej et al. (2019) if this robustness is optimal in the sense that a different codon-amino acid assignment would not generate a code that is even more robust. At present, efforts to expand the genetic code are very relevant in biotechnological applications, for example, for the synthesis of new drugs (Anderson et al.,", "keywords": [], "author": "Fimmel"}, {"content": "The Quality of Genetic Code Models in Terms of Their Robustness Against Point Mutations\nIn this paper, we investigate the quality of selected models of theoretical genetic codes in terms of their robustness against point mutations. To deal with this problem, we used a graph representation including all possible single nucleotide point mutations occurring in codons, which are building blocks of every protein-coding sequence. Following graph theory, the quality of a given code model is measured using the set conductance property which has a useful biological interpretation. Taking this approach, we found the most robust genetic code structures for a given number of coding blocks. In addition, we tested several properties of genetic code models generated by the binary dichotomic algorithms (BDA) and compared them with randomly generated genetic code models. The results indicate that BDA-generated models possess better properties in terms of the conductance measure than the majority of randomly generated genetic code models and, even more, that BDA-models can achieve the best possible conductance values. Therefore, BDA-generated models are very robust towards changes in encoded information generated by single nucleotide substitutions.", "file_path": "./data/paper/Fimmel/s11538-019-00603-2.pdf", "title": "The Quality of Genetic Code Models in Terms of Their Robustness Against Point Mutations", "abstract": "In this paper, we investigate the quality of selected models of theoretical genetic codes in terms of their robustness against point mutations. To deal with this problem, we used a graph representation including all possible single nucleotide point mutations occurring in codons, which are building blocks of every protein-coding sequence. Following graph theory, the quality of a given code model is measured using the set conductance property which has a useful biological interpretation. Taking this approach, we found the most robust genetic code structures for a given number of coding blocks. In addition, we tested several properties of genetic code models generated by the binary dichotomic algorithms (BDA) and compared them with randomly generated genetic code models. The results indicate that BDA-generated models possess better properties in terms of the conductance measure than the majority of randomly generated genetic code models and, even more, that BDA-models can achieve the best possible conductance values. Therefore, BDA-generated models are very robust towards changes in encoded information generated by single nucleotide substitutions.", "keywords": ["Genetic code", "Dichotomy classes", "Point mutations"], "author": "Fimmel"}, {"content": "Computational Analysis of Genetic Code Variations Optimized for the Robustness against Point Mutations with Wobble-like Effects\nIt is believed that the codon-amino acid assignments of the standard genetic code (SGC) help to minimize the negative effects caused by point mutations. All possible point mutations of the genetic code can be represented as a weighted graph with weights that correspond to the probabilities of these mutations. The robustness of a code against point mutations can be described then by means of the so-called conductance measure. This paper quantifies the wobble effect, which was investigated previously by applying the weighted graph approach, and seeks optimal weights using an evolutionary optimization algorithm to maximize the code's robustness. One result of our study is that the robustness of the genetic code is least influenced by mutations in the third position-like with the wobble effect. Moreover, the results clearly demonstrate that point mutations in the first, and even more importantly, in the second base of a codon have a very large influence on the robustness of the genetic code. These results were compared to single nucleotide variants (SNV) in coding sequences which support our findings. Additionally, it was analyzed which structure of a genetic code evolves from random code tables when the robustness is maximized. Our calculations show that the resulting code tables are very close to the standard genetic code. In conclusion, the results illustrate that the robustness against point mutations seems to be an important factor in the evolution of the standard genetic code.", "file_path": "./data/paper/Fimmel/life-11-01338-v2 (1).pdf", "title": "Computational Analysis of Genetic Code Variations Optimized for the Robustness against Point Mutations with Wobble-like Effects", "abstract": "It is believed that the codon-amino acid assignments of the standard genetic code (SGC) help to minimize the negative effects caused by point mutations. All possible point mutations of the genetic code can be represented as a weighted graph with weights that correspond to the probabilities of these mutations. The robustness of a code against point mutations can be described then by means of the so-called conductance measure. This paper quantifies the wobble effect, which was investigated previously by applying the weighted graph approach, and seeks optimal weights using an evolutionary optimization algorithm to maximize the code's robustness. One result of our study is that the robustness of the genetic code is least influenced by mutations in the third position-like with the wobble effect. Moreover, the results clearly demonstrate that point mutations in the first, and even more importantly, in the second base of a codon have a very large influence on the robustness of the genetic code. These results were compared to single nucleotide variants (SNV) in coding sequences which support our findings. Additionally, it was analyzed which structure of a genetic code evolves from random code tables when the robustness is maximized. Our calculations show that the resulting code tables are very close to the standard genetic code. In conclusion, the results illustrate that the robustness against point mutations seems to be an important factor in the evolution of the standard genetic code.", "keywords": ["genetic code", "point mutations", "wobble effect", "evolutionary algorithm"], "author": "Fimmel"}, {"content": "The Relation Between k-Circularity and Circularity of Codes\nA code X is k-circular if any concatenation of at most k words from X , when read on a circle, admits exactly one partition into words from X. It is circular if it is k-circular for every integer k. While it is not a priori clear from the definition, there exists, for every pair (n,), an integer k such that every k-circular-letter code over an alphabet of cardinality n is circular, and we determine the least such integer k for all values of n and. The k-circular codes may represent an important evolutionary step between the circular codes, such as the comma-free codes, and the genetic code.", "file_path": "./data/paper/Fimmel/The_Relation_Between_k-Circularity_and_Circularity.pdf", "title": "The Relation Between k-Circularity and Circularity of Codes", "abstract": "A code X is k-circular if any concatenation of at most k words from X , when read on a circle, admits exactly one partition into words from X. It is circular if it is k-circular for every integer k. While it is not a priori clear from the definition, there exists, for every pair (n,), an integer k such that every k-circular-letter code over an alphabet of cardinality n is circular, and we determine the least such integer k for all values of n and. The k-circular codes may represent an important evolutionary step between the circular codes, such as the comma-free codes, and the genetic code.", "keywords": ["Circular code", "k-circular code", "Genetic code", "Code evolution"], "author": "Fimmel"}, {"content": "On models of the genetic code generated by binary dichotomic algorithms\nIn this paper we introduce the concept of a BDA-generated model of the genetic code which is based on binary dichotomic algorithms (BDAs). A BDA-generated model is based on binary dichotomic algorithms (BDAs). Such a BDA partitions the set of 64 codons into two disjoint classes of size 32 each and provides a generalization of known partitions like the Rumer dichotomy. We investigate what partitions can be generated when a set of different BDAs is applied sequentially to the set of codons. The search revealed that these models are able to generate code tables with very different numbers of classes ranging from 2 to 64. We have analyzed whether there are models that map the codons to their amino acids. A perfect matching is not possible. However, we present models that describe the standard genetic code with only few errors. There are also models that map all 64 codons uniquely to 64 classes showing that BDAs can be used to identify codons precisely. This could serve as a basis for further mathematical analysis using coding theory, for example. The hypothesis that BDAs might reflect a molecular mechanism taking place in the decoding center of the ribosome is discussed. The scan demonstrated that binary dichotomic partitions are able to model different aspects of the genetic code very well. The search was performed with our tool Beady-A.", "file_path": "./data/paper/Fimmel/1-s2.0-S0303264714002044-main.pdf", "title": "On models of the genetic code generated by binary dichotomic algorithms", "abstract": "In this paper we introduce the concept of a BDA-generated model of the genetic code which is based on binary dichotomic algorithms (BDAs). A BDA-generated model is based on binary dichotomic algorithms (BDAs). Such a BDA partitions the set of 64 codons into two disjoint classes of size 32 each and provides a generalization of known partitions like the Rumer dichotomy. We investigate what partitions can be generated when a set of different BDAs is applied sequentially to the set of codons. The search revealed that these models are able to generate code tables with very different numbers of classes ranging from 2 to 64. We have analyzed whether there are models that map the codons to their amino acids. A perfect matching is not possible. However, we present models that describe the standard genetic code with only few errors. There are also models that map all 64 codons uniquely to 64 classes showing that BDAs can be used to identify codons precisely. This could serve as a basis for further mathematical analysis using coding theory, for example. The hypothesis that BDAs might reflect a molecular mechanism taking place in the decoding center of the ribosome is discussed. The scan demonstrated that binary dichotomic partitions are able to model different aspects of the genetic code very well. The search was performed with our tool Beady-A.", "keywords": [], "author": "Fimmel"}, {"content": "On comparing composition principles of long DNA sequences with those of random ones\nThe revelation of compositional principles of the organization of long DNA sequences is one of the crucial tasks in the study of biosystems. This paper is devoted to the analysis of compositional differences between real DNA sequences and Markov-like randomly generated similar sequences. We formulate, among other things, a generalization of Chargaff's second rule and verify it empirically on DNA sequences of five model organisms taken from Genbank. Moreover, we apply the same frequency analysis to simulated sequences. When comparing the afore mentioned-real and random-sequences, significant similarities, on the one hand, as well as essential differences between them, on the other hand, are revealed and described. The significance and possible origin of these differences, including those from the viewpoint of maximum informativeness of genetic texts, is discussed. Besides, the paper discusses the question of what is a \"long\" DNA sequence and quantifies the choice of length. More precisely, the standard deviations of relative frequencies of bases stabilize from the length of approximately 100 000 bases, whereas the deviations are about three times as large at the length of approximately 25 000 bases.", "file_path": "./data/paper/Fimmel/1-s2.0-S0303264719300590-main (1).pdf", "title": "On comparing composition principles of long DNA sequences with those of random ones", "abstract": "The revelation of compositional principles of the organization of long DNA sequences is one of the crucial tasks in the study of biosystems. This paper is devoted to the analysis of compositional differences between real DNA sequences and Markov-like randomly generated similar sequences. We formulate, among other things, a generalization of Chargaff's second rule and verify it empirically on DNA sequences of five model organisms taken from Genbank. Moreover, we apply the same frequency analysis to simulated sequences. When comparing the afore mentioned-real and random-sequences, significant similarities, on the one hand, as well as essential differences between them, on the other hand, are revealed and described. The significance and possible origin of these differences, including those from the viewpoint of maximum informativeness of genetic texts, is discussed. Besides, the paper discusses the question of what is a \"long\" DNA sequence and quantifies the choice of length. More precisely, the standard deviations of relative frequencies of bases stabilize from the length of approximately 100 000 bases, whereas the deviations are about three times as large at the length of approximately 25 000 bases.", "keywords": [], "author": "Fimmel"}, {"content": "On models of the genetic code generated by binary dichotomic algorithms\nIn this paper we introduce the concept of a BDA-generated model of the genetic code which is based on binary dichotomic algorithms (BDAs). A BDA-generated model is based on binary dichotomic algorithms (BDAs). Such a BDA partitions the set of 64 codons into two disjoint classes of size 32 each and provides a generalization of known partitions like the Rumer dichotomy. We investigate what partitions can be generated when a set of different BDAs is applied sequentially to the set of codons. The search revealed that these models are able to generate code tables with very different numbers of classes ranging from 2 to 64. We have analyzed whether there are models that map the codons to their amino acids. A perfect matching is not possible. However, we present models that describe the standard genetic code with only few errors. There are also models that map all 64 codons uniquely to 64 classes showing that BDAs can be used to identify codons precisely. This could serve as a basis for further mathematical analysis using coding theory, for example. The hypothesis that BDAs might reflect a molecular mechanism taking place in the decoding center of the ribosome is discussed. The scan demonstrated that binary dichotomic partitions are able to model different aspects of the genetic code very well. The search was performed with our tool Beady-A.", "file_path": "./data/paper/Fimmel/1-s2.0-S0303264714002044-main (3).pdf", "title": "On models of the genetic code generated by binary dichotomic algorithms", "abstract": "In this paper we introduce the concept of a BDA-generated model of the genetic code which is based on binary dichotomic algorithms (BDAs). A BDA-generated model is based on binary dichotomic algorithms (BDAs). Such a BDA partitions the set of 64 codons into two disjoint classes of size 32 each and provides a generalization of known partitions like the Rumer dichotomy. We investigate what partitions can be generated when a set of different BDAs is applied sequentially to the set of codons. The search revealed that these models are able to generate code tables with very different numbers of classes ranging from 2 to 64. We have analyzed whether there are models that map the codons to their amino acids. A perfect matching is not possible. However, we present models that describe the standard genetic code with only few errors. There are also models that map all 64 codons uniquely to 64 classes showing that BDAs can be used to identify codons precisely. This could serve as a basis for further mathematical analysis using coding theory, for example. The hypothesis that BDAs might reflect a molecular mechanism taking place in the decoding center of the ribosome is discussed. The scan demonstrated that binary dichotomic partitions are able to model different aspects of the genetic code very well. The search was performed with our tool Beady-A.", "keywords": [], "author": "Fimmel"}, {"content": "Citation Database: Enabling Libraries to Contribute to an Open and Interconnected Citation Graph\nCitations play a crucial role in the scientific discourse, in information retrieval, and in bibliometrics. Many initiatives are currently promoting the idea of having free and open citation data. Creation of citation data, however, is not part of the cataloging workflow in libraries nowadays. In this paper, we present our project Linked Open Citation Database, in which we design distributed processes and a system infrastructure based on linked data technology. The goal is to show that efficiently cataloging citations in libraries using a semi-automatic approach is possible. We specifically describe the current state of the workflow and its implementation. We show that we could significantly improve the automatic reference extraction that is crucial for the subsequent data curation. We further give insights on the curation and linking process and provide evaluation results that not only direct the further development of the project, but also allow us to discuss its overall feasibility.", "file_path": "./data/paper/Eckert/3197026.3197050.pdf", "title": "Citation Database: Enabling Libraries to Contribute to an Open and Interconnected Citation Graph", "abstract": "Citations play a crucial role in the scientific discourse, in information retrieval, and in bibliometrics. Many initiatives are currently promoting the idea of having free and open citation data. Creation of citation data, however, is not part of the cataloging workflow in libraries nowadays. In this paper, we present our project Linked Open Citation Database, in which we design distributed processes and a system infrastructure based on linked data technology. The goal is to show that efficiently cataloging citations in libraries using a semi-automatic approach is possible. We specifically describe the current state of the workflow and its implementation. We show that we could significantly improve the automatic reference extraction that is crucial for the subsequent data curation. We further give insights on the curation and linking process and provide evaluation results that not only direct the further development of the project, but also allow us to discuss its overall feasibility.", "keywords": ["Information systems \u2192 Digital libraries and archives", "RESTful web services", "Resource Description Framework (RDF)", "citation data", "library workflows", "linked open data", "editorial system", "automatic reference extraction"], "author": "Eckert"}, {"content": "Reverse-Transliteration of Hebrew script for Entity Disambiguation\nJudaicaLink is a novel domain-specific knowledge base for Jewish culture, history, and studies. JudaicaLink is built by extracting structured, multilingual knowledge from different sources and it is mainly used for contextualization and entity linking. One of the main challenges in the process of aggregating Jewish digital resources is the use of the Hebrew script. The proof of materials in German central cataloging systems is based on the conversion of the original script of the publication into the Latin script, known as Romanization. Many of our datasets, especially those from library catalogs, contain Hebrew authors' names and titles which are only in Latin script without their Hebrew script. Therefore, it is not possible to identify them in and link them to other corresponding Hebrew resources. To overcome this problem, we designed a reverse-transliteration model which reconstructs the Hebrew script from the Romanization and consequently makes the entities more accessible.", "file_path": "./data/paper/Eckert/3366030.3366099.pdf", "title": "Reverse-Transliteration of Hebrew script for Entity Disambiguation", "abstract": "JudaicaLink is a novel domain-specific knowledge base for Jewish culture, history, and studies. JudaicaLink is built by extracting structured, multilingual knowledge from different sources and it is mainly used for contextualization and entity linking. One of the main challenges in the process of aggregating Jewish digital resources is the use of the Hebrew script. The proof of materials in German central cataloging systems is based on the conversion of the original script of the publication into the Latin script, known as Romanization. Many of our datasets, especially those from library catalogs, contain Hebrew authors' names and titles which are only in Latin script without their Hebrew script. Therefore, it is not possible to identify them in and link them to other corresponding Hebrew resources. To overcome this problem, we designed a reverse-transliteration model which reconstructs the Hebrew script from the Romanization and consequently makes the entities more accessible.", "keywords": ["CCS CONCEPTS", "Information systems \u2192 Extraction, transformation and loading", "Digital libraries and archives", "Resource Description Framework (RDF) Linked Open Data, Retro-Conversion, Transcription, Writing Systems, Digital Library"], "author": "Eckert"}, {"content": "Cyberbullying Detection in Social Networks Using Deep Learning Based Models; A Reproducibility Study\nCyberbullying is a disturbing online misbehaviour with troubling consequences. It appears in different forms, and in most of the social networks, it is in textual format. Automatic detection of such incidents requires intelligent systems. Most of the existing studies have approached this problem with conventional machine learning models and the majority of the developed models in these studies are adaptable to a single social network at a time. In recent studies, deep learning based models have found their way in the detection of cyberbullying incidents, claiming that they can overcome the limitations of the conventional models, and improve the detection performance. In this paper, we investigate the findings of a recent literature in this regard. We successfully reproduced the findings of this literature and validated their findings using the same datasets, namely Wikipedia, Twitter, and Formspring, used by the authors. Then we expanded our work by applying the developed methods on a new YouTube dataset (~54k posts by ~4k users) and investigated the performance of the models in new social media platforms. We also transferred and evaluated the performance of the models trained on one platform to another platform. Our findings show that the deep learning based models outperform the machine learning models previously applied to the same YouTube dataset. We believe that the deep learning based models can also benefit from integrating other sources of information and looking into the impact of profile information of the users in social networks.", "file_path": "./data/paper/Eckert/1812.08046.pdf", "title": "Cyberbullying Detection in Social Networks Using Deep Learning Based Models; A Reproducibility Study", "abstract": "Cyberbullying is a disturbing online misbehaviour with troubling consequences. It appears in different forms, and in most of the social networks, it is in textual format. Automatic detection of such incidents requires intelligent systems. Most of the existing studies have approached this problem with conventional machine learning models and the majority of the developed models in these studies are adaptable to a single social network at a time. In recent studies, deep learning based models have found their way in the detection of cyberbullying incidents, claiming that they can overcome the limitations of the conventional models, and improve the detection performance. In this paper, we investigate the findings of a recent literature in this regard. We successfully reproduced the findings of this literature and validated their findings using the same datasets, namely Wikipedia, Twitter, and Formspring, used by the authors. Then we expanded our work by applying the developed methods on a new YouTube dataset (~54k posts by ~4k users) and investigated the performance of the models in new social media platforms. We also transferred and evaluated the performance of the models trained on one platform to another platform. Our findings show that the deep learning based models outperform the machine learning models previously applied to the same YouTube dataset. We believe that the deep learning based models can also benefit from integrating other sources of information and looking into the impact of profile information of the users in social networks.", "keywords": ["Deep Learning", "Online Bullying", "Neural Networks", "Social Networks", "Transfer Learning", "YouTube"], "author": "Eckert"}, {"content": "Investigating the Role of Argumentation in the Rhetorical Analysis of Scientific Publications with Neural Multi-Task Learning Models\nExponential growth in the number of scientific publications yields the need for effective automatic analysis of rhetorical aspects of scientific writing. Acknowledging the argumentative nature of scientific text, in this work we investigate the link between the argumentative structure of scientific publications and rhetorical aspects such as discourse categories or citation contexts. To this end, we (1) augment a corpus of scientific publications annotated with four layers of rhetoric annotations with argumentation annotations and (2) investigate neural multi-task learning architectures combining argument extraction with a set of rhetorical classification tasks. By coupling rhetorical classifiers with the extraction of argumentative components in a joint multi-task learning setting, we obtain significant performance gains for different rhetorical analysis tasks.", "file_path": "./data/paper/Eckert/D18-1370.pdf", "title": "Investigating the Role of Argumentation in the Rhetorical Analysis of Scientific Publications with Neural Multi-Task Learning Models", "abstract": "Exponential growth in the number of scientific publications yields the need for effective automatic analysis of rhetorical aspects of scientific writing. Acknowledging the argumentative nature of scientific text, in this work we investigate the link between the argumentative structure of scientific publications and rhetorical aspects such as discourse categories or citation contexts. To this end, we (1) augment a corpus of scientific publications annotated with four layers of rhetoric annotations with argumentation annotations and (2) investigate neural multi-task learning architectures combining argument extraction with a set of rhetorical classification tasks. By coupling rhetorical classifiers with the extraction of argumentative components in a joint multi-task learning setting, we obtain significant performance gains for different rhetorical analysis tasks.", "keywords": [], "author": "Eckert"}, {"content": "Citation-Based Summarization of Scientific Articles Using Semantic Textual Similarity\nThe number of publications is rapidly growing and it is essential to enable fast access and analysis of relevant articles. In this paper, we describe a set of methods based on measuring semantic textual similarity, which we use to semantically analyze and summarize publications through other publications that cite them. We report the performance of our approach in the context of the third CL-SciSumm shared task and show that our system performs favorably to competing systems in terms of produced summaries.", "file_path": "./data/paper/Eckert/BIRNDL_2017_paper_7 (6).pdf", "title": "Citation-Based Summarization of Scientific Articles Using Semantic Textual Similarity", "abstract": "The number of publications is rapidly growing and it is essential to enable fast access and analysis of relevant articles. In this paper, we describe a set of methods based on measuring semantic textual similarity, which we use to semantically analyze and summarize publications through other publications that cite them. We report the performance of our approach in the context of the third CL-SciSumm shared task and show that our system performs favorably to competing systems in terms of produced summaries.", "keywords": ["Scientific Publication Mining", "Scientific Summarization", "Information Retrieval", "Text Classification"], "author": "Eckert"}, {"content": "Investigating Convolutional Networks and Domain-Specific Embeddings for Semantic Classification of Citations\nCitation graphs and indices underpin most bibliometric analyses. However, measures derived from citation graphs do not provide insights into qualitative aspects of scientic publications. In this work, we aim to semantically characterize citations in terms of polarity and purpose. We frame polarity and purpose detection as classication tasks and investigate the performance of convolutional networks with general and domain-specic word embeddings on these tasks. Our best performing model outperforms previously reported results on a benchmark dataset by a wide margin.", "file_path": "./data/paper/Eckert/3127526.3127531.pdf", "title": "Investigating Convolutional Networks and Domain-Specific Embeddings for Semantic Classification of Citations", "abstract": "Citation graphs and indices underpin most bibliometric analyses. However, measures derived from citation graphs do not provide insights into qualitative aspects of scientic publications. In this work, we aim to semantically characterize citations in terms of polarity and purpose. We frame polarity and purpose detection as classication tasks and investigate the performance of convolutional networks with general and domain-specic word embeddings on these tasks. Our best performing model outperforms previously reported results on a benchmark dataset by a wide margin.", "keywords": ["\u2022Computing methodologies ! Natural language processing", "Neural networks", "\u2022Information storage and retrieval ! Content analysis and indexing", "Citation Polarity, Citation Purpose, Classication, Support Vector Machine, Convolutional Neural Network, Word Embeddings"], "author": "Eckert"}, {"content": "A Web Application to Search a Large Repository of Taxonomic Relations from the Web\nTaxonomic relations (also known as isa or hypernymy relations) represent one of the key building blocks of knowledge bases and foundational ontologies and provide a fundamental piece of information for many text understanding applications. Despite the availability of very large knowledge bases, however, some Natural Language Processing and Semantic Web applications (e.g., Ontology Learning) still require automatic isa relation harvesting techniques to cope with the coverage of domain-specific and long-tail terms. In this paper, we present a web application to directly query a very large repository of isa relations automatically extracted from the Common Crawl (the largest publicly available crawl of the Web). Our resource can be also downloaded for research purposes and accessed programmatically (we additionally release a Java application programming interface for this purpose).", "file_path": "./data/paper/Eckert/paper58.pdf", "title": "A Web Application to Search a Large Repository of Taxonomic Relations from the Web", "abstract": "Taxonomic relations (also known as isa or hypernymy relations) represent one of the key building blocks of knowledge bases and foundational ontologies and provide a fundamental piece of information for many text understanding applications. Despite the availability of very large knowledge bases, however, some Natural Language Processing and Semantic Web applications (e.g., Ontology Learning) still require automatic isa relation harvesting techniques to cope with the coverage of domain-specific and long-tail terms. In this paper, we present a web application to directly query a very large repository of isa relations automatically extracted from the Common Crawl (the largest publicly available crawl of the Web). Our resource can be also downloaded for research purposes and accessed programmatically (we additionally release a Java application programming interface for this purpose).", "keywords": ["Hearst patterns", "hypernym extraction", "information extraction and Natural Language Processing techniques for the Semantic Web"], "author": "Eckert"}, {"content": "Easy and complex: new perspectives for metadata modeling using RDF-star and Named Graphs\nThe Resource Description Framework is well-established as a lingua franca for data modeling and is designed to integrate heterogeneous data at instance and schema level using statements. While RDF is conceptually simple, data models nevertheless get complex, when complex data needs to be represented. Additional levels of indirection with intermediate resources instead of simple properties lead to higher barriers for prospective users of the data. Based on three patterns, we argue that shifting information to a meta-level can not only be used to (1) provide provenance information, but can also help to (2) maintain backwards compatibility for existing models, and to (3) reduce the complexity of a data model. There are, however, multiple ways in RDF to use a meta-level, i.e., to provide additional statements about statements. With Named Graphs, there exists a well-established mechanism to describe groups of statements. Since its inception, however, it has been hard to make statements about single statements. With the introduction of RDF-star, a new way to provide data about single statements is now available. We show that the combination of RDF-star and Named Graphs is a viable solution to express data on a meta-level and propose that this meta-level should be used as first class citizen in data modeling.", "file_path": "./data/paper/Eckert/2211.16195.pdf", "title": "Easy and complex: new perspectives for metadata modeling using RDF-star and Named Graphs", "abstract": "The Resource Description Framework is well-established as a lingua franca for data modeling and is designed to integrate heterogeneous data at instance and schema level using statements. While RDF is conceptually simple, data models nevertheless get complex, when complex data needs to be represented. Additional levels of indirection with intermediate resources instead of simple properties lead to higher barriers for prospective users of the data. Based on three patterns, we argue that shifting information to a meta-level can not only be used to (1) provide provenance information, but can also help to (2) maintain backwards compatibility for existing models, and to (3) reduce the complexity of a data model. There are, however, multiple ways in RDF to use a meta-level, i.e., to provide additional statements about statements. With Named Graphs, there exists a well-established mechanism to describe groups of statements. Since its inception, however, it has been hard to make statements about single statements. With the introduction of RDF-star, a new way to provide data about single statements is now available. We show that the combination of RDF-star and Named Graphs is a viable solution to express data on a meta-level and propose that this meta-level should be used as first class citizen in data modeling.", "keywords": ["data modeling", "RDF-star", "Named Graphs", "meta-level"], "author": "Eckert"}, {"content": "X-SCITLDR: Cross-Lingual Extreme Summarization of Scholarly Documents\nThe number of scientific publications nowadays is rapidly increasing, causing information overload for researchers and making it hard for scholars to keep up to date with current trends and lines of work. Consequently, recent work on applying text mining technologies for scholarly publications has investigated the application of automatic text summarization technologies, including extreme summarization, for this domain. However, previous work has concentrated only on monolingual settings, primarily in English. In this paper, we fill this research gap and present an abstractive cross-lingual summarization dataset for four different languages in the scholarly domain, which enables us to train and evaluate models that process English papers and generate summaries in German, Italian, Chinese and Japanese. We present our new X-SCITLDR dataset for multilingual summarization and thoroughly benchmark different models based on a state-of-the-art multilingual pre-trained model, including a two-stage 'summarize and translate' approach and a direct cross-lingual model. We additionally explore the benefits of intermediate-stage training using English monolingual summarization and machine translation as intermediate tasks and analyze performance in zero-and few-shot scenarios.", "file_path": "./data/paper/Eckert/3529372.3530938.pdf", "title": "X-SCITLDR: Cross-Lingual Extreme Summarization of Scholarly Documents", "abstract": "The number of scientific publications nowadays is rapidly increasing, causing information overload for researchers and making it hard for scholars to keep up to date with current trends and lines of work. Consequently, recent work on applying text mining technologies for scholarly publications has investigated the application of automatic text summarization technologies, including extreme summarization, for this domain. However, previous work has concentrated only on monolingual settings, primarily in English. In this paper, we fill this research gap and present an abstractive cross-lingual summarization dataset for four different languages in the scholarly domain, which enables us to train and evaluate models that process English papers and generate summaries in German, Italian, Chinese and Japanese. We present our new X-SCITLDR dataset for multilingual summarization and thoroughly benchmark different models based on a state-of-the-art multilingual pre-trained model, including a two-stage 'summarize and translate' approach and a direct cross-lingual model. We additionally explore the benefits of intermediate-stage training using English monolingual summarization and machine translation as intermediate tasks and analyze performance in zero-and few-shot scenarios.", "keywords": ["CCS Concepts", "Computing methodologies \u2192 Natural language processing", "Natural language generation", "Language resources Scholarly document processing, Summarization, Multilinguality"], "author": "Eckert"}, {"content": "Interactive Exploration of Geospatial Network Visualization\nThis paper presents a tabletop visualization of relations between geo-positioned locations. We developed an interactive visualization, which enables users to visually explore a geospatial network of actors. The multitouch tabletop, and the large size of the interactive surface invite users to explore the visualization in semi-public spaces. For a case study on scientific collaborations between institutions, we applied and improved several existing techniques for a walk-up-and-use system aimed at scientists for a social setting at a conference. We describe our iterative design approach, our two implemented prototypes, and the lessons learnt from their creation. We conducted user evaluation studies at the two on-location demonstrations, which provide evidence of the prototype usability and usefulness, and its support for understanding the distribution and connectivity in a geospatial network.", "file_path": "./data/paper/Nagel/p557-nagel.pdf", "title": "Interactive Exploration of Geospatial Network Visualization", "abstract": "This paper presents a tabletop visualization of relations between geo-positioned locations. We developed an interactive visualization, which enables users to visually explore a geospatial network of actors. The multitouch tabletop, and the large size of the interactive surface invite users to explore the visualization in semi-public spaces. For a case study on scientific collaborations between institutions, we applied and improved several existing techniques for a walk-up-and-use system aimed at scientists for a social setting at a conference. We describe our iterative design approach, our two implemented prototypes, and the lessons learnt from their creation. We conducted user evaluation studies at the two on-location demonstrations, which provide evidence of the prototype usability and usefulness, and its support for understanding the distribution and connectivity in a geospatial network.", "keywords": ["geo-visualization", "tabletop interfaces", "human computer interaction", "exploration", "interactive maps ACM Classification H.5.2 [Information Interfaces And Presentation]: User Interfaces -Interaction styles"], "author": "Nagel"}, {"content": "Visually Analysing Urban Mobility: Results and Insights from Three Student Research Projects\nSince the digitalization of urban mobility, such systems generate large and heterogenous data sets. Handling these, however, is still a major challenge. To help people making sense of urban data, skills in data literacy are needed. In this paper, we present results and insights from a set of mobility visualization projects, students from computer science and design developed over one semester. The resulting prototypes visualize (a) user-generated bike trajectories, (b) population movement, and (c) public transit passengers. After a brief justification on why we find urban mobility a practical domain for teaching data literacy, we introduce the course and describe its set and setting. For each project, we report on the challenge, the data, the visualization prototype, and the impact these projects had. Our contribution is twofold: in the description of the problem-oriented course and its results, and in the reflection of our approach and our lessons learnt.", "file_path": "./data/paper/Nagel/s42489-020-00040-5.pdf", "title": "Visually Analysing Urban Mobility: Results and Insights from Three Student Research Projects", "abstract": "Since the digitalization of urban mobility, such systems generate large and heterogenous data sets. Handling these, however, is still a major challenge. To help people making sense of urban data, skills in data literacy are needed. In this paper, we present results and insights from a set of mobility visualization projects, students from computer science and design developed over one semester. The resulting prototypes visualize (a) user-generated bike trajectories, (b) population movement, and (c) public transit passengers. After a brief justification on why we find urban mobility a practical domain for teaching data literacy, we introduce the course and describe its set and setting. For each project, we report on the challenge, the data, the visualization prototype, and the impact these projects had. Our contribution is twofold: in the description of the problem-oriented course and its results, and in the reflection of our approach and our lessons learnt.", "keywords": ["Urban data", "Education", "Research-based learning", "Functional prototypes", "Visual analysis Visuelles Analysieren urbaner Mobilit\u00e4t -Ergebnisse und Erkenntnisse aus drei studentischen Forschungsprojekten"], "author": "Nagel"}, {"content": "Components of a Research 2.0 Infrastructure\nIn this paper, we investigate the components of a Research 2.0 infrastructure. We propose building blocks and their concrete implementation to leverage Research 2.0 practice and technologies in our field, including a publication feed format for exchanging publication data, a RESTful API to retrieve publication and Web 2.0 data, and a publisher suit for refining and aggregating data. We illustrate the use of this infrastructure with Research 2.0 application examples ranging from a Mash-Up environment, a mobile and multitouch application, thereby demonstrating the strength of this infrastructure.", "file_path": "./data/paper/Nagel/JointECTEL2010_6PagerV5.pdf", "title": "Components of a Research 2.0 Infrastructure", "abstract": "In this paper, we investigate the components of a Research 2.0 infrastructure. We propose building blocks and their concrete implementation to leverage Research 2.0 practice and technologies in our field, including a publication feed format for exchanging publication data, a RESTful API to retrieve publication and Web 2.0 data, and a publisher suit for refining and aggregating data. We illustrate the use of this infrastructure with Research 2.0 application examples ranging from a Mash-Up environment, a mobile and multitouch application, thereby demonstrating the strength of this infrastructure.", "keywords": ["research 2.0", "infrastructure", "mash-ups", "#Res2TEL 1"], "author": "Nagel"}, {"content": "Learning Dashboards & Learnscapes\nIn this paper, we briefly present our work on applications for 'learning analytics'. Our work ranges from dashboards on small mobile devices to learnscapes on large public displays. We capture and visualize traces of learning activities, in order to promote self-awareness and reflection, and to enable learners to define goals and track progress towards these goals. We identify HCI issues for this kind of applications.", "file_path": "./data/paper/Nagel/eist2012_submission_6-2.pdf", "title": "Learning Dashboards & Learnscapes", "abstract": "In this paper, we briefly present our work on applications for 'learning analytics'. Our work ranges from dashboards on small mobile devices to learnscapes on large public displays. We capture and visualize traces of learning activities, in order to promote self-awareness and reflection, and to enable learners to define goals and track progress towards these goals. We identify HCI issues for this kind of applications.", "keywords": ["learning analytics", "information visualization ACM Classification H.5.2. User Interfaces", "K.3.1. Computer uses in Education Design. Human Factors. Context"], "author": "Nagel"}, {"content": "Touching Transport -A Case Study on Visualizing Metropolitan Public Transit on Interactive Tabletops\nDue to recent technical developments, urban systems generate large and complex data sets. While visualizations have been used to make these accessible, often they are tailored to one specific group of users, typically the public or expert users. We present Touching Transport, an application that allows a diverse group of users to visually explore public transit data on a multi-touch tabletop. It provides multiple perspectives of the data and consists of three visualization modes conveying tempo-spatial patterns as map, time-series, and arc view. We exhibited our system publicly, and evaluated it in a lab study with three distinct user groups: citizens with knowledge of the local environment, experts in the domain of public transport, and non-experts with neither local nor domain knowledge. Our observations and evaluation results show we achieved our goals of both attracting visitors to explore the data while enabling gathering insights for both citizens and experts. We discuss the design considerations in developing our system, and describe our lessons learned in designing engaging tabletop visualizations.", "file_path": "./data/paper/Nagel/2598153.2598180.pdf", "title": "Touching Transport -A Case Study on Visualizing Metropolitan Public Transit on Interactive Tabletops", "abstract": "Due to recent technical developments, urban systems generate large and complex data sets. While visualizations have been used to make these accessible, often they are tailored to one specific group of users, typically the public or expert users. We present Touching Transport, an application that allows a diverse group of users to visually explore public transit data on a multi-touch tabletop. It provides multiple perspectives of the data and consists of three visualization modes conveying tempo-spatial patterns as map, time-series, and arc view. We exhibited our system publicly, and evaluated it in a lab study with three distinct user groups: citizens with knowledge of the local environment, experts in the domain of public transport, and non-experts with neither local nor domain knowledge. Our observations and evaluation results show we achieved our goals of both attracting visitors to explore the data while enabling gathering insights for both citizens and experts. We discuss the design considerations in developing our system, and describe our lessons learned in designing engaging tabletop visualizations.", "keywords": ["H.5.m [Information Interfaces and Presentation]: Misc Design, Human Factors data visualization", "multi-touch", "exhibition", "urban mobility"], "author": "Nagel"}, {"content": "Staged Analysis: From Evocative to Comparative Visualizations of Urban Mobility\nIn this paper we examine the concept of staged analysis through a case study on visualizing urban mobility exhibited in a public gallery space. Recently, many cities introduced bike-sharing in order to promote cycling among locals and visitors. We explore how citizens can be guided from evocative impressions of bicycling flows to comparative analysis of three bike-sharing systems. The main aim for visualizations in exhibition contexts is to encourage a shift from temporary interest to deeper insight into a complex phenomenon. To pursue this ambition we introduce cf. city flows, a comparative visualization environment of urban bike mobility designed to help citizens casually analyze three bike-sharing systems in the context of a public exhibition space. Multiple large screens show the space of flows in bike-sharing for three selected world cities: Berlin, London, and New York. Bike journeys are represented in three geospatial visualizations designed to be progressively more analytical, from animated trails to small-multiple glyphs. In this paper, we describe our design concept and process, the exhibition setup, and discuss some of the insights visitors gained while interacting with the visualizations.", "file_path": "./data/paper/Nagel/Staged_analysis_From_evocative_to_comparative_visualizations_of_Urban_mobility (1).pdf", "title": "Staged Analysis: From Evocative to Comparative Visualizations of Urban Mobility", "abstract": "In this paper we examine the concept of staged analysis through a case study on visualizing urban mobility exhibited in a public gallery space. Recently, many cities introduced bike-sharing in order to promote cycling among locals and visitors. We explore how citizens can be guided from evocative impressions of bicycling flows to comparative analysis of three bike-sharing systems. The main aim for visualizations in exhibition contexts is to encourage a shift from temporary interest to deeper insight into a complex phenomenon. To pursue this ambition we introduce cf. city flows, a comparative visualization environment of urban bike mobility designed to help citizens casually analyze three bike-sharing systems in the context of a public exhibition space. Multiple large screens show the space of flows in bike-sharing for three selected world cities: Berlin, London, and New York. Bike journeys are represented in three geospatial visualizations designed to be progressively more analytical, from animated trails to small-multiple glyphs. In this paper, we describe our design concept and process, the exhibition setup, and discuss some of the insights visitors gained while interacting with the visualizations.", "keywords": ["Flow maps", "comparative visualization", "urban mobility", "bike-sharing", "geovisualization", "storytelling", "public displays"], "author": "Nagel"}, {"content": "maeve -An Interactive Tabletop Installation for Exploring Background Information in Exhibitions\nThis paper introduces the installation maeve: a novel approach to present background information in exhibitions in a highly interactive, tangible and sociable manner. Visitors can collect paper cards representing the exhibits and put them on an interactive surface to display associated concepts and relations to other works. As a result, users can explore both the unifying themes of the exhibition as well as individual characteristics of exhibits. On basis of metadata schemata developed in the MACE (Metadata for Architectural Contents in Europe) project, the system has been put to use the Architecture Biennale to display the entries to the Everyville student competition", "file_path": "./data/paper/Nagel/maeve_An_Interactive_Tabletop_Installatio.pdf", "title": "maeve -An Interactive Tabletop Installation for Exploring Background Information in Exhibitions", "abstract": "This paper introduces the installation maeve: a novel approach to present background information in exhibitions in a highly interactive, tangible and sociable manner. Visitors can collect paper cards representing the exhibits and put them on an interactive surface to display associated concepts and relations to other works. As a result, users can explore both the unifying themes of the exhibition as well as individual characteristics of exhibits. On basis of metadata schemata developed in the MACE (Metadata for Architectural Contents in Europe) project, the system has been put to use the Architecture Biennale to display the entries to the Everyville student competition", "keywords": ["Metadata", "visualization", "concept networks", "tangible interface", "exhibition", "user experience"], "author": "Nagel"}, {"content": "Touching Transport -A Case Study on Visualizing Metropolitan Public Transit on Interactive Tabletops\nDue to recent technical developments, urban systems generate large and complex data sets. While visualizations have been used to make these accessible, often they are tailored to one specific group of users, typically the public or expert users. We present Touching Transport, an application that allows a diverse group of users to visually explore public transit data on a multi-touch tabletop. It provides multiple perspectives of the data and consists of three visualization modes conveying tempo-spatial patterns as map, time-series, and arc view. We exhibited our system publicly, and evaluated it in a lab study with three distinct user groups: citizens with knowledge of the local environment, experts in the domain of public transport, and non-experts with neither local nor domain knowledge. Our observations and evaluation results show we achieved our goals of both attracting visitors to explore the data while enabling gathering insights for both citizens and experts. We discuss the design considerations in developing our system, and describe our lessons learned in designing engaging tabletop visualizations.", "file_path": "./data/paper/Nagel/Nagel - Touching Transport - AVI14.pdf", "title": "Touching Transport -A Case Study on Visualizing Metropolitan Public Transit on Interactive Tabletops", "abstract": "Due to recent technical developments, urban systems generate large and complex data sets. While visualizations have been used to make these accessible, often they are tailored to one specific group of users, typically the public or expert users. We present Touching Transport, an application that allows a diverse group of users to visually explore public transit data on a multi-touch tabletop. It provides multiple perspectives of the data and consists of three visualization modes conveying tempo-spatial patterns as map, time-series, and arc view. We exhibited our system publicly, and evaluated it in a lab study with three distinct user groups: citizens with knowledge of the local environment, experts in the domain of public transport, and non-experts with neither local nor domain knowledge. Our observations and evaluation results show we achieved our goals of both attracting visitors to explore the data while enabling gathering insights for both citizens and experts. We discuss the design considerations in developing our system, and describe our lessons learned in designing engaging tabletop visualizations.", "keywords": ["H.5.m [Information Interfaces and Presentation]: Misc Design, Human Factors data visualization", "multi-touch", "exhibition", "urban mobility"], "author": "Nagel"}, {"content": "An Initial Visual Analysis of German City Dashboards\nCity dashboards are powerful tools for quickly understanding various urban phenomena through visualizing urban data using various techniques. In this paper, we investigate the common data sets used, and the most frequently employed visualization techniques in city dashboards. We reviewed 16 publicly available dashboards from 42 cities that are part of German smart city programs and have a high level of digitization. Through analysis of the visualization techniques used, we present our results visually and discuss our findings.", "file_path": "./data/paper/Nagel/013-015.pdf", "title": "An Initial Visual Analysis of German City Dashboards", "abstract": "City dashboards are powerful tools for quickly understanding various urban phenomena through visualizing urban data using various techniques. In this paper, we investigate the common data sets used, and the most frequently employed visualization techniques in city dashboards. We reviewed 16 publicly available dashboards from 42 cities that are part of German smart city programs and have a high level of digitization. Through analysis of the visualization techniques used, we present our results visually and discuss our findings.", "keywords": [], "author": "Nagel"}, {"content": "Staged Analysis: From Evocative to Comparative Visualizations of Urban Mobility\nIn this paper we examine the concept of staged analysis through a case study on visualizing urban mobility exhibited in a public gallery space. Recently, many cities introduced bike-sharing in order to promote cycling among locals and visitors. We explore how citizens can be guided from evocative impressions of bicycling flows to comparative analysis of three bike-sharing systems. The main aim for visualizations in exhibition contexts is to encourage a shift from temporary interest to deeper insight into a complex phenomenon. To pursue this ambition we introduce cf. city flows, a comparative visualization environment of urban bike mobility designed to help citizens casually analyze three bike-sharing systems in the context of a public exhibition space. Multiple large screens show the space of flows in bike-sharing for three selected world cities: Berlin, London, and New York. Bike journeys are represented in three geospatial visualizations designed to be progressively more analytical, from animated trails to small-multiple glyphs. In this paper, we describe our design concept and process, the exhibition setup, and discuss some of the insights visitors gained while interacting with the visualizations.", "file_path": "./data/paper/Nagel/Staged_analysis_From_evocative_to_comparative_visualizations_of_Urban_mobility.pdf", "title": "Staged Analysis: From Evocative to Comparative Visualizations of Urban Mobility", "abstract": "In this paper we examine the concept of staged analysis through a case study on visualizing urban mobility exhibited in a public gallery space. Recently, many cities introduced bike-sharing in order to promote cycling among locals and visitors. We explore how citizens can be guided from evocative impressions of bicycling flows to comparative analysis of three bike-sharing systems. The main aim for visualizations in exhibition contexts is to encourage a shift from temporary interest to deeper insight into a complex phenomenon. To pursue this ambition we introduce cf. city flows, a comparative visualization environment of urban bike mobility designed to help citizens casually analyze three bike-sharing systems in the context of a public exhibition space. Multiple large screens show the space of flows in bike-sharing for three selected world cities: Berlin, London, and New York. Bike journeys are represented in three geospatial visualizations designed to be progressively more analytical, from animated trails to small-multiple glyphs. In this paper, we describe our design concept and process, the exhibition setup, and discuss some of the insights visitors gained while interacting with the visualizations.", "keywords": ["Flow maps", "comparative visualization", "urban mobility", "bike-sharing", "geovisualization", "storytelling", "public displays"], "author": "Nagel"}, {"content": "Unfolding -A library for interactive maps\nVisualizing data with geo-spatial properties has become more important and prevalent due to the wide spread dissemination of devices, sensors, databases, and services with references to the physical world. Yet, with existing tools it is often difficult to create interactive geovisualizations tailored for a particular domain or a specific dataset. We present Unfolding, a library for interactive maps and data visualization. Unfolding provides an API for designers to quickly create and customize geo-visualizations. In this paper, we describe the design criteria, the development process, and the functionalities of Unfolding. We demonstrate its versatility in use through a collection of examples. Results from a user survey suggests programmers find the library easy to learn and to use.", "file_path": "./data/paper/Nagel/unfolding-v0.9.pdf", "title": "Unfolding -A library for interactive maps", "abstract": "Visualizing data with geo-spatial properties has become more important and prevalent due to the wide spread dissemination of devices, sensors, databases, and services with references to the physical world. Yet, with existing tools it is often difficult to create interactive geovisualizations tailored for a particular domain or a specific dataset. We present Unfolding, a library for interactive maps and data visualization. Unfolding provides an API for designers to quickly create and customize geo-visualizations. In this paper, we describe the design criteria, the development process, and the functionalities of Unfolding. We demonstrate its versatility in use through a collection of examples. Results from a user survey suggests programmers find the library easy to learn and to use.", "keywords": ["toolkits", "maps", "geovisualization", "information visualization", "interaction design", "programming"], "author": "Nagel"}, {"content": "Muse: Visualizing the origins and connections of institutions based on co-authorship of publications\nThis paper introduces Muse, an interactive visualization of publications to explore the collaborations between institutions. For this, the data on co-authorship is utilized, as these signify an existing level of collaboration. The affiliations of authors are geo-located, resulting in relations not only among institutions, but also between regions and countries. We explain our ideas behind the visualization and the interactions, and briefly describe the data processing and the implementation of the working prototype. The prototype focuses on a visualization for large tabletop displays, enabling multiple users to explore their personal networks, as well as emerging patterns in shared networks within a collaborative public setting. For the prototype we used the publication data of the EC-TEL conference.", "file_path": "./data/paper/Nagel/Muse_Visualizing_the_origins_and_connect.pdf", "title": "Muse: Visualizing the origins and connections of institutions based on co-authorship of publications", "abstract": "This paper introduces Muse, an interactive visualization of publications to explore the collaborations between institutions. For this, the data on co-authorship is utilized, as these signify an existing level of collaboration. The affiliations of authors are geo-located, resulting in relations not only among institutions, but also between regions and countries. We explain our ideas behind the visualization and the interactions, and briefly describe the data processing and the implementation of the working prototype. The prototype focuses on a visualization for large tabletop displays, enabling multiple users to explore their personal networks, as well as emerging patterns in shared networks within a collaborative public setting. For the prototype we used the publication data of the EC-TEL conference.", "keywords": ["geo-visualization", "tabletop", "research", "human computer interaction"], "author": "Nagel"}, {"content": "A visual approach for analyzing readmissions in intensive care medicine\nIntensive care units (ICUs) are under constant pressure to balance capacity. ICUs have a limited number of resources and therefore effective monitoring of planned and unplanned transfers is crucial. Transfers can result in critical readmissions, i.e. \"down\" transfers from ICUs to a normal ward or an intermediate care unit (IMC) and back \"up\" to an ICU within a short time span. In this work, we present a tool to visually analyze such readmissions. Patient transfer data is extracted from a clinical data warehouse via HL7-FHIR. The interactive prototype consists of a timeline of readmission cases, an aggregated view of transfer flows between wards, and histograms and calender heatmaps to show a set of key performance indicators. The aim of our tool is to support identifying peaks, discovering temporal patterns, comparing wards, and investigating potential causes. We report on our user centered approach, describe the data pipeline, present the visualization and interaction techniques of the functional prototype, and discuss initial feedback.", "file_path": "./data/paper/Nagel/A_visual_approach_for_analyzing_readmissions_in_intensive_care_medicine.pdf", "title": "A visual approach for analyzing readmissions in intensive care medicine", "abstract": "Intensive care units (ICUs) are under constant pressure to balance capacity. ICUs have a limited number of resources and therefore effective monitoring of planned and unplanned transfers is crucial. Transfers can result in critical readmissions, i.e. \"down\" transfers from ICUs to a normal ward or an intermediate care unit (IMC) and back \"up\" to an ICU within a short time span. In this work, we present a tool to visually analyze such readmissions. Patient transfer data is extracted from a clinical data warehouse via HL7-FHIR. The interactive prototype consists of a timeline of readmission cases, an aggregated view of transfer flows between wards, and histograms and calender heatmaps to show a set of key performance indicators. The aim of our tool is to support identifying peaks, discovering temporal patterns, comparing wards, and investigating potential causes. We report on our user centered approach, describe the data pipeline, present the visualization and interaction techniques of the functional prototype, and discuss initial feedback.", "keywords": ["patient transfers", "intensive care units", "temporal sequence visualization", "flow visualization"], "author": "Nagel"}, {"content": "Venice Unfolding: A Tangible User Interface for Exploring Faceted Data in a Geographical Context\nWe introduce Venice Unfolding, a case study on tangible geo-visualization on an interactive tabletop to enable the exploration of architectural projects in Venice. Our tangible user interface consists of a large display showing projects on a map, and a polyhedral object to browse these data interactively by selecting and filtering various metadata facets. In this paper we describe a prototype employing new methods to communicate territorial data in visual and tangible ways. The object reduces the barrier between the physical world and virtual data, and eases the understanding of faceted geographical data, enabling urban planners and citizens alike to participate in the discovery and analysis of information referring to the physical world.", "file_path": "./data/paper/Nagel/1868914.1869019-2.pdf", "title": "Venice Unfolding: A Tangible User Interface for Exploring Faceted Data in a Geographical Context", "abstract": "We introduce Venice Unfolding, a case study on tangible geo-visualization on an interactive tabletop to enable the exploration of architectural projects in Venice. Our tangible user interface consists of a large display showing projects on a map, and a polyhedral object to browse these data interactively by selecting and filtering various metadata facets. In this paper we describe a prototype employing new methods to communicate territorial data in visual and tangible ways. The object reduces the barrier between the physical world and virtual data, and eases the understanding of faceted geographical data, enabling urban planners and citizens alike to participate in the discovery and analysis of information referring to the physical world.", "keywords": ["Tangible interaction", "tabletop interface", "geo visualization", "urban planning", "multi-touch", "visual browsing", "faceted data ACM Classification H5.2. [User Interfaces]: Interaction styles", "Input devices and strategies", "Prototyping"], "author": "Nagel"}, {"content": "Supporting Medical Personnel at Analyzing Chronic Lung Diseases with Interactive Visualizations\nFigure 1: Partial view of our prototype showing spirometry measures. Each line chart depicts the progression over time, color-coded by severity. The dates are presented as a timeline, while the pre-and post-treatment values of each exam can be analyzed via the slope charts.", "file_path": "./data/paper/Nagel/065-067.pdf", "title": "Supporting Medical Personnel at Analyzing Chronic Lung Diseases with Interactive Visualizations", "abstract": "Figure 1: Partial view of our prototype showing spirometry measures. Each line chart depicts the progression over time, color-coded by severity. The dates are presented as a timeline, while the pre-and post-treatment values of each exam can be analyzed via the slope charts.", "keywords": ["CCS Concepts", "Human-centered computing \u2192 Visualization systems and tools", "Empirical studies in visualization", "Visual analytics"], "author": "Nagel"}, {"content": "Traffic Origins: A Simple Visualization Technique to Support Traffic Incident Analysis\nFigure 1: Traffic incidents highlighted using the Traffic Origins approach. When an incident occurs, its location is marked by an expanding circle that reveals traffic conditions in the immediate vicinity.", "file_path": "./data/paper/Nagel/trafficOrigins_PacificVis_2873a316.pdf", "title": "Traffic Origins: A Simple Visualization Technique to Support Traffic Incident Analysis", "abstract": "Figure 1: Traffic incidents highlighted using the Traffic Origins approach. When an incident occurs, its location is marked by an expanding circle that reveals traffic conditions in the immediate vicinity.", "keywords": ["Software [H", "1", "2]: User/Machine Systems-Human factors Software [H", "5", "2]: Computer Graphics-Graphical User In-"], "author": "Nagel"}]