Research Development Toolkit

Authors

Introduction

Motivation – Heterogeneity

Software systems in a scientific context:

  • Large: Often tens to hundreds of software components
  • Heterogeneous: Hosting solutions, version control systems, programming languages, build systems, licenses, maintenance models, 3rd party/1st party, legacy components
  • Complex: Dependency structure, versions, variability

Researchers develop software in many different ways:

  • Everyone tries to use the best-suited technologies and processes
  • Different non-functional requirements (documentation, maintainability, maturity, level of professionalism in development process)
  • Teams set up their own CI, sometimes repository hosting – centralized IT services focus on other aspects

Introduction – Example System

cogimon-core-ros-nightly.png

Introduction – Goal

Bringing together multiple, usually disparate, aspects:

  • Organizational, social and historical aspects of projects
  • Versioning (across version control systems and repository hosting solutions)
  • Build system-level dependencies (across build systems and programming languages)

to facilitate

  • Construction and testing
  • Deployment
  • (Re-)use and reproducibility
  • Documentation
  • Dissemination

of large, heterogeneous, complex research software systems

Agenda for this Presentation

  • General approach
    • System description model and language
    • Dependency and metadata analysis
    • Repository server, collaboration process
    • Operations
    • Reproducibility of software, experiments and publications
    • Best practices
  • Specific aspects
    • Build generator tool
    • Bootstrapping
    • Docker slave configuration
    • Research software catalog

Description Language

Description Language – Concepts

  • Project in this context:
    • Logical, organizational concept
    • Has a history, versions and one or more manifestations (Example: the RSB middleware used to be hosted on Redmine, now on GitHub)
    • Cannot be built or executed
  • Project Version in this context:
    • Concrete, consists of source code artifacts
    • Often a particular revision in a particular repository
    • Can be built, producing one (or more) components
  • Distribution in this context:
    • Collection of project versions that can be built, deployed and used together
    • Can be built, producing a system consisting of components
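
The three concepts above can be sketched as plain Python data classes. This is only an illustration of how they relate, not the toolkit's actual model; all class and field names are assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """Logical, organizational concept; cannot be built or executed."""
    name: str
    manifestations: list = field(default_factory=list)  # e.g. repository URLs over time


@dataclass
class ProjectVersion:
    """Concrete source code, often one revision in one repository; can be built."""
    project: Project
    name: str          # e.g. "master"
    repository: str    # where the source code artifacts live
    revision: str      # branch, tag or commit


@dataclass
class Distribution:
    """Project versions that can be built, deployed and used together."""
    name: str
    versions: list = field(default_factory=list)

    def project_names(self):
        # Names of all projects that contribute a version to this distribution.
        return sorted({version.project.name for version in self.versions})
```

Note that only ProjectVersion carries a repository and a revision; Project stays purely organizational.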

Domain: Research Software Systems

Core domain concepts:

  • Project, Project Version, Distribution
  • System, Component

Other important (domain) concepts:

  • Dependency
  • Required, provided feature
  • Target platform
  • Variability
  • Composition
  • Generalization
  • Build step description
  • Metadata
  • Person, Role

Simplified Meta Model

meta-model.png

Recipes – Concrete Syntax

  • Recipes describe instances of metamodel concepts:
    • Project recipes: Projects and project versions
    • Distribution recipes: Distributions
    • Template recipes: Templates and aspects
    • Person recipes: People
  • Recipe syntax is based on YAML

    # Comments! Take that JSON!
    scalar: |
      Long text with "" and '' and even \
    list:
      - first
      - second
    mapping:
      key: value
    
  • Each recipe kind has a schema which, among other things, organizes the recipe into sections:

    catalog:
      
    variables:
      
    include:
      
    versions:
      …
    
  • Variable substitution syntax
    • Scalar reference: ${NAME|DEFAULT}
    • Splicing reference: @{NAME|DEFAULT}
    • Delegation: ${next-value|DEFAULT}
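
The scalar and splicing references above can be illustrated with a minimal Python sketch. It is not the generator's implementation: the names expand_scalar and expand_list are hypothetical, only a flat variable table is modelled, and nesting, escaping and next-value delegation are left out.

```python
import re

# ${NAME} or ${NAME|DEFAULT}; illustrative only, no nesting or escaping.
SCALAR_REF = re.compile(r"\$\{([^|}]+)(?:\|([^}]*))?\}")
# A list element consisting solely of @{NAME} or @{NAME|DEFAULT}.
SPLICE_REF = re.compile(r"@\{([^|}]+)(?:\|([^}]*))?\}")


def expand_scalar(text, variables):
    """Replace every ${NAME|DEFAULT} in text with the variable's value."""
    def replace(match):
        name, default = match.group(1), match.group(2)
        if name in variables:
            return str(variables[name])
        if default is not None:
            return default
        raise KeyError(f"undefined variable: {name}")
    return SCALAR_REF.sub(replace, text)


def expand_list(items, variables):
    """Expand a list; an @{NAME|DEFAULT} element is spliced in as several elements."""
    result = []
    for item in items:
        match = SPLICE_REF.fullmatch(item)
        if match is None:
            result.append(expand_scalar(item, variables))
            continue
        name, default = match.group(1), match.group(2)
        if name in variables:
            result.extend(variables[name])
        elif default == "[]":
            pass  # default is the empty list: splice in nothing
        elif default is not None:
            result.append(default)
        else:
            raise KeyError(f"undefined variable: {name}")
    return result
```

A scalar reference always yields a single string, whereas a splicing reference may contribute zero or more elements to the surrounding list.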

Recipes – Example

Project Recipe rsb-cpp.project

    templates:                        # Generalization
    - github
    - cmake-cpp

    variables:
      recipe.maintainer:              # People
      - Jan Moringen <jmoringe@techfak.uni-bielefeld.de>
      access: public                  # Metadata

      github.user: open-rsx           # Repository
      github.project: rsb-cpp

      branches: [ master ]            # Minimal specification of versions

Distribution Recipe my-distribution.distribution

    include:
    - other-distribution              # Composition

    versions:
    - name: rsb-cpp
      versions:
      - version: master
        parameters:                   # Variability
          cmake.options:
          - '@{next-value|[]}'
          - CMAKE_BUILD_TYPE=Debug
    - rsb-python@master

Recipes – Repository

One aspect of the Cognitive Interaction Toolkit is a shared repository of recipes describing software projects and software systems:

Recipes 1580
├─Project Recipes 1380
└─Distribution Recipes 200
Commits 9000
Contributors 100

Recipes – Demo

Dependency and Metadata Analysis

Automatic Analysis – Motivation

Concise recipes are enabled by automatic analysis.

By inspecting a particular revision in the repository associated with a project version, the following information is determined automatically, so recipe authors do not have to declare it explicitly:

Dependencies
  • Provided features (ideally with versions and scope)
  • Required features (ideally with versions and scope)
Metadata
  • People (authors, maintainers, committers)
  • License(s)
  • Description
  • Access restrictions
Build Steps
  • Names of modules
  • Produced artifacts

Automatic Analysis – Dependency Model

  • Feature: triple (nature, target[, version])
  • Project versions provide features (usually versioned)
  • Project versions require features (versioned or unversioned)
  • System packages provide features
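
The requires/provides relation above can be sketched in Python as follows. This is a simplified illustration, not the generator's resolution algorithm. The names Feature, satisfies and resolve are hypothetical, and the sketch assumes that a versioned requirement means "at least this version".

```python
from typing import NamedTuple, Optional


class Feature(NamedTuple):
    nature: str                       # e.g. "cmake", "maven", "pkg-config"
    target: str                       # e.g. "alibrary"
    version: Optional[tuple] = None   # e.g. (1, 0); None means unversioned


def satisfies(provided, required):
    """True if the provided feature can satisfy the required one.

    Assumption of this sketch: a versioned requirement is a minimum version.
    """
    if (provided.nature, provided.target) != (required.nature, required.target):
        return False
    if required.version is None:
        return True
    return provided.version is not None and provided.version >= required.version


def resolve(required, providers):
    """Map each required feature to the first provider satisfying it, or None.

    providers maps a provider name (project version or system package)
    to the list of features it provides.
    """
    resolution = {}
    for feature in required:
        resolution[feature] = next(
            (name for name, provided in providers.items()
             if any(satisfies(candidate, feature) for candidate in provided)),
            None)
    return resolution
```

Requirements that resolve to None are the ones that would have to come from the target platform or be reported as missing.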

requires-provides.png

Automatic Analysis – Examples

CMake

    project(myproject VERSION 1.2       # provides cmake:myproject:1.2
                      LANGUAGES C CXX)  # metadata

    find_package(alibrary 1.0 REQUIRED) # requires cmake:alibrary:1.0

    find_package(PkgConfig REQUIRED)    # makes pkg_search_module available
    pkg_search_module(ANOTHER_LIBRARY another_library) # requires pkg-config:another_library

Maven

    <project>

      <licenses>…</licenses>         <!-- metadata -->
      <organization>…</organization>

      <groupId>open-rsx</groupId>    <!-- provides maven:open-rsx/rsb:0.18 -->
      <artifactId>rsb</artifactId>
      <version>0.18</version>

      <dependencies>
        <dependency>                 <!-- requires maven:junit/junit:1.0 -->
          <groupId>junit</groupId>
          <artifactId>junit</artifactId>
          <version>1.0</version>
        </dependency>
      </dependencies>

    </project>

Python Setuptools

    from setuptools import setup

    setup(name        = 'rsb',             # provides setuptools:rsb:0.18
          version     = '0.18',

          description = "Event-driven …",  # metadata
          author      = 'Johannes Wienke',
          license     = 'LGPLv3+',

          install_requires = [             # requires setuptools:protobuf:2.8
              'protobuf>=2.8'
          ]

          )
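
How a requirement string such as 'protobuf>=2.8' might be turned into a required feature can be sketched as follows. This is an assumption-laden illustration, not the analysis the generator performs. The function name requirement_to_feature is hypothetical, and only a name with at most one version constraint is supported.

```python
import re

# A name, optionally followed by one comparison operator and a version.
# Illustrative subset only: extras, markers and multiple constraints are rejected.
REQUIREMENT = re.compile(
    r"\s*([A-Za-z0-9._-]+)\s*(?:(==|>=|<=|~=|!=|>|<)\s*([0-9][A-Za-z0-9.]*))?\s*")


def requirement_to_feature(spec):
    """Turn 'name' or 'name>=version' into a (nature, target, version) triple."""
    match = REQUIREMENT.fullmatch(spec)
    if match is None:
        raise ValueError(f"unsupported requirement: {spec!r}")
    name, _operator, version = match.groups()
    return ("setuptools", name, version)
```

Rejecting anything outside the supported subset, rather than guessing, mirrors the idea that heuristic analysis results may be incomplete and can be supplemented by the recipe author.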

ROS Package

    <package>
      <name>robo_nav</name>                <!-- provides ros-package:robo_nav:0.1 -->
      <version>0.1</version>

      <description>…</description>         <!-- metadata -->
      <maintainer email="…">…</maintainer>
      <author email="…">…</author>
      <license>BSD</license>

      <build_depend>                       <!-- requires ros-package:path_planner -->
        path_planner
      </build_depend>
    </package>
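
As a rough illustration of what such an analysis involves, the following Python sketch extracts provided and required features from a package.xml string using only the standard library. It is not the generator's analysis code. The function name analyze_ros_package, the result layout and the set of tags treated as requirements are assumptions of this sketch.

```python
import xml.etree.ElementTree as ET

# Assumption of this sketch: these dependency tags all count as requirements.
DEPENDENCY_TAGS = ("build_depend", "run_depend", "exec_depend", "depend")


def analyze_ros_package(xml_text):
    """Extract provided/required features and the license from package.xml text."""
    root = ET.fromstring(xml_text)
    name = root.findtext("name").strip()
    version = root.findtext("version").strip()
    requires = sorted({
        ("ros-package", element.text.strip())
        for tag in DEPENDENCY_TAGS
        for element in root.findall(tag)
    })
    return {
        "provides": [("ros-package", name, version)],
        "requires": requires,
        "license": [element.text.strip() for element in root.findall("license")],
    }
```

Because the manifest is declarative, everything needed is read directly from the XML, which is why results for this project type can be complete and exact.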

Automatic Analysis – Limitations and Strategies

  • Accuracy of automatic analysis results depends on project type:
    • Complete and exact (Maven, ROS packages, pkg-config)
    • Potentially incomplete and heuristic (CMake, Python setuptools)
  • Thus: recipe authors can help out:

    extra-requires:
      - '@{next-value|[]}'
      - nature:
        target:
        version:
    

    Both extra-requires and extra-provides take part in delegation and are merged with the results of automatic analysis

  • Also an extension point: analysis strategies for new project natures can be added
  • Future work (proof-of-concept stage): limited interpretation for complicated cases for CMake and Python setuptools
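
Merging declared requirements with analysis results can be sketched in Python as follows. The exact merge rules of the generator are not described here. This sketch assumes that a declared entry replaces an analyzed entry with the same nature and target, and the function name merge_requires is hypothetical.

```python
def merge_requires(analyzed, extra):
    """Merge declared requirements into automatically analyzed ones.

    Both arguments are lists of dicts with the keys "nature", "target"
    and optionally "version". Assumption of this sketch: a declared entry
    overrides an analyzed entry for the same (nature, target).
    """
    merged = {(r["nature"], r["target"]): r for r in analyzed}
    for requirement in extra:
        merged[(requirement["nature"], requirement["target"])] = requirement
    return list(merged.values())
```

With this policy a recipe author can both add requirements the analysis missed and refine ones it found, for example by supplying a version.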

Platform Requirements

platform-hierarchy.png

  • Declaration

     variables:
       platform-requires:
         ubuntu:
           packages:
           - '@{next-value}'
           - gcc
           bionic:
             packages:
             - '@{next-value}'
             - clang
    
  • Merged according to platform hierarchy
  • Can be declared in
    • Project recipes
    • Template recipes. For example, all Maven projects need a JDK
  • Future improvement: further automation based on required features
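
Merging along the platform hierarchy can be sketched in Python as follows. This is a simplified illustration, not the generator's implementation. The function name platform_packages is hypothetical, the input is assumed to be the already parsed platform-requires mapping, and '@{next-value}' entries are assumed to have been expanded beforehand.

```python
def platform_packages(requirements, platform):
    """Collect packages for a platform such as ["ubuntu", "bionic"].

    Walks the nested mapping from the most general to the most specific
    platform component and accumulates the "packages" lists on the way,
    keeping the first occurrence of each package.
    """
    packages = []
    node = requirements
    for component in platform:
        node = node.get(component)
        if node is None:
            break  # nothing declared for this or any more specific platform
        for package in node.get("packages", []):
            if package not in packages:
                packages.append(package)
    return packages
```

For a platform without specific declarations, only the requirements of its more general ancestors apply.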

Catalog

Automatic analysis and metadata are also useful for humans:

catalog-screenshot-project.png

catalog-screenshot-distribution.png

Analysis – Demo

  • Analyzing a repository

    ./build-generator analyze https://github.com/open-rsx/rsb-cpp > results.json
    xdg-open results.json
    
  • Computing platform requirements for a distribution

    ./build-generator platform-requirements                            \
      -p 'ubuntu xenial'                                               \
      PATH-TO-CITK/recipes/distributions/rsb-nightly.distribution
    

Build Generator

Build Generator – Overview

  • Starting point for users
  • Unified command-line interface for
    • Installing and configuring Jenkins instances
    • Working with recipes (validation, analysis, reports, …)
    • Generating Jenkins jobs
    • Generating other build processes (Makefile, Dockerfile)
  • Single (large, 30 MB) binary
    • Reasonably portable across Linux systems
    • Few dependencies (OpenSSL's libssl being the annoying one)
  • Source code and binary releases on GitHub: https://github.com/rdtk/generator

Build Generator – Process

build-generator-process.png

End of "General Approach" Part of the Presentation

Now we address and discuss the following more specific aspects:

  1. Bootstrapping the aforementioned tools and processes on a user's computer
  2. Automatically configuring Docker-based Jenkins slaves for continuous integration

Bootstrapping

Bootstrapping – Motivation

Scenario:

  • User wants to try out, reproduce, develop or learn a continuous integration setup on their own machine
  • Running Linux, Docker installed
  • Doesn't want to modify or pollute system with lots of software
  • Doesn't want to manually apply a long list of setup steps

The requirements are thus:

  • Initial download and installation should be minimal
  • From there, maximum automation, minimal number of manual steps

Bootstrapping – Process

bootstrapping.png

Bootstrapping – Demo

  • Get generator binary from https://github.com/rdtk/generator/releases
  • Install Jenkins

    ./build-generator install-jenkins        \
      --profile local-docker                 \
      -u jan -p test -e a@b.c                \
      install-test
    # Takes between 60 and 300 seconds
    
    cd install-test
    ./start_jenkins
    
  • Clone recipe repository

    git clone -b wip-docker https://opensource.cit-ec.de/git/citk
    
  • Generate Distribution Jobs

    ./build-generator generate                                       \
      -u jan -p test                                                 \
      -D 'view.create?=true' -D view.name='Demo 1'                   \
      citk/distributions/build-generator-nightly.distribution
    
  • Result: https://localhost:8080/view/Demo 1/

Docker-based Jenkins Slaves

Docker Slaves – Motivation

Build generator supports different targets/modes:

  • Jenkins jobs for continuous integration, deployment, mixture of both
  • Makefile, Dockerfile
  • Jenkins jobs using Docker slaves

Advantages of Docker slaves:

  • Full isolation between jobs and from host system (good for CI, reproducibility)
  • Install dependencies in container – no side-effects on host system
  • Build, test on different Linux platforms independent of host
  • Share runnable systems as Docker images

Docker Slaves – Process

docker-process-ci-initial-setup.png

docker-process-ci-scm-trigger.png

Docker Slaves – Demo

  • Generate Distribution Jobs

    ./build-generator generate                                 \
      -u jan -p test                                           \
      -D 'view.create?=true' -D view.name='Demo 2'             \
      -m ci-docker                                             \
      citk/distributions/cogimon-core-nightly.distribution
    
  • Result: https://localhost:8080/view/Demo 2/

Thank You for Your Attention!

Summary

  • General approach
    • System description model and language
    • Dependency and metadata analysis
  • Specific aspects
    • Build generator tool
    • Bootstrapping
    • Docker slave configuration
    • Research software catalog

Questions?

Backup Slides

Catalog – Demo

https://citkat-citec.bob.ci.cit-ec.net/browse/distribution/

Docker Slaves – Monolithic Process

docker-process-monolithic.png