CPP / C++ - Bookmarks

Table of Contents

1 Bookmarks

1.1 Places

1.1.1 General

Places:

Companies and Organizations

Blogs and homepages:

1.1.2 Stackoverflow tags

C++ - General

Boost Libraries

C Programming

Tooling

Concurrency, parallelism and HPC - High Performance Computing

  • multithreading
  • critical-section
  • thread-safety
  • thread-local-storage
  • pthreads - POSIX Threads API, shared by Unix-like operating systems (Linux, BSD, OSX, Android, iOS) and some real-time operating systems (RTOS).
  • atomic - memory-barriers - memory-model - data-race
  • GPU
  • CUDA - "CUDA is a parallel computing platform and programming model for Nvidia GPUs (Graphics Processing Units). CUDA provides an interface to Nvidia GPUs through a variety of programming languages, libraries, and APIs."
  • OpenCL - "OpenCL (Open Computing Language) is a framework for writing programs that execute across heterogeneous platforms consisting of CPUs, GPUs, and other processors."
  • OpenMP - "OpenMP is a cross-platform multi-threading API which allows fine-grained task parallelization and synchronization using special compiler directives."
  • MPI - "MPI is the Message Passing Interface, a library for distributed memory parallel programming and the de facto standard method for using distributed memory clusters for high-performance technical computing. Questions about using MPI for parallel programming go under this tag; questions on, eg, installation problems with MPI implementations are best tagged with the appropriate implementation-specific tag, eg MPICH or OpenMPI."
  • SIMD - "Single instruction, multiple data (SIMD) is the concept of having each instruction operate on a small chunk or vector of data elements. CPU vector instruction sets include: x86 SSE and AVX, ARM NEON, and PowerPC AltiVec. To efficiently use SIMD instructions, data needs to be in structure-of-arrays form and should occur in longer streams. Naively "SIMD optimized" code frequently surprises by running slower than the original."
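The SIMD tag description above says that data should be in structure-of-arrays form. A minimal sketch contrasting the two layouts follows; the names ParticleAoS, ParticlesSoA, to_soa and scale_x are hypothetical and chosen only for illustration.

```cpp
#include <cstddef>
#include <vector>

// Array-of-structures: x, y and z of one particle are adjacent in memory,
// so consecutive x values are sizeof(ParticleAoS) bytes apart (strided access).
struct ParticleAoS {
    float x, y, z;
};

// Structure-of-arrays: all x values are contiguous, which is the layout
// that vector loads and auto-vectorizers generally prefer.
struct ParticlesSoA {
    std::vector<float> x, y, z;

    explicit ParticlesSoA(std::size_t n) : x(n), y(n), z(n) {}
    std::size_t size() const { return x.size(); }
};

// Convert an AoS container into the SoA layout.
ParticlesSoA to_soa(const std::vector<ParticleAoS>& in) {
    ParticlesSoA out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        out.x[i] = in[i].x;
        out.y[i] = in[i].y;
        out.z[i] = in[i].z;
    }
    return out;
}

// Unit-stride loop over one contiguous array: a candidate for
// compiler auto-vectorization (which is not guaranteed).
void scale_x(ParticlesSoA& p, float factor) {
    for (std::size_t i = 0; i < p.size(); ++i) {
        p.x[i] *= factor;
    }
}
```

Whether the compiler actually vectorizes scale_x depends on the compiler and its flags, so measuring is still necessary.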

GUI - Graphical User Interface Frameworks

Debugging, Diagnosing, Troubleshooting and Fault Analysis

Operating Systems APIs

Windows:

Posix + UNIX:

  • posix - Standardized C API shared by many Unix-like operating systems and some embedded real-time operating systems (RTOS).
  • UNIX + C
  • UNIX + C++
  • ptrace - "The ptrace() system call provides a means by which a parent process may observe and control the execution of another process, and examine and change its core image and registers."
  • XLIB - Low-level user interface API for the X Window System (X11), the user interface layer common in Unix-like OSes.
  • device-driver

Low Level: Assembly, ABI - Application Binary Interface

ISA - Instruction Set Architecture Cores

  • x86 => Dominant ISA in IBM-PC-compatible desktops and servers.
  • x86-64
  • ARM - Advanced RISC Machines - The dominant ISA and CPU core family in mobile devices, handsets, smartphones, consumer electronics, tablets, routers, printers, …, and high-end embedded systems.
  • ARMV7
  • ARMV8
  • ARM64

Network Protocols

System Programming

General

Peripherals and interfaces:

1.2 Code Standards and guidelines

Embedded Systems Coding Standards

1.3 Software Design

1.3.1 General

1.3.2 Useful C and C++ codebases for learning

High-quality C++ Code Bases

Interesting high-quality C-Code Bases

1.3.3 Lessons from other projects

Lessons and Techniques Extracted from Other Projects and Codes

1.3.5 Defensive Programming

1.3.6 C-Interface, FFI, DLL and Interoperability

Creating C-APIs, interfacing C++ with other languages and C-programming

1.3.7 Deployment and delivery

1.3.8 Exception and Error Handling

1.3.9 Header-only libraries examples

1.4 Unix, Posix and Linux C APIs and Interfaces

1.4.1 Asynchronous IO and IO Multiplexing APIs

Async IO APIs are widely used for building highly scalable servers, web servers and network applications. They are used under the hood by frameworks such as Boost.ASIO and libuv (used by NodeJS), and also by the Nginx web server. These APIs allow a single thread to handle multiple socket connections and multiple IO operations.

1.4.2 Linux Pseudo-file systems

  • proc - process information pseudo-filesystem - Linux Programmer's Manual
    • "The proc filesystem is a pseudo-filesystem which provides an interface to kernel data structures. It is commonly mounted at /proc. Typically, it is mounted automatically by the system, but it can also be mounted manually using a command such as: …"
  • sysfs - a filesystem for exporting kernel objects - Linux Programmer's Manual
    • "The sysfs filesystem is a pseudo-filesystem which provides an interface to kernel data structures. (More precisely, the files and directories in sysfs provide a view of the kobject structures defined internally within the kernel.) The files under sysfs provide information about devices, kernel modules, filesystems, and other kernel components."
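Because these pseudo-filesystems expose kernel data as ordinary text files, they can be read with plain file IO. The sketch below, assuming a Linux system with /proc mounted, looks up one "Key: value" line such as those found in /proc/self/status. The helper name read_proc_field is hypothetical.

```cpp
#include <cstddef>
#include <fstream>
#include <optional>
#include <string>

// Looks up a "Key:<whitespace>value" line in a proc-style text file and
// returns the value with leading whitespace stripped, or std::nullopt if
// the file cannot be opened or the key is absent.
std::optional<std::string> read_proc_field(const std::string& path,
                                           const std::string& key) {
    std::ifstream file(path);
    if (!file) {
        return std::nullopt;
    }
    const std::string prefix = key + ":";
    std::string line;
    while (std::getline(file, line)) {
        // Match only lines that begin with "Key:".
        if (line.compare(0, prefix.size(), prefix) == 0) {
            std::size_t start = line.find_first_not_of(" \t", prefix.size());
            return start == std::string::npos ? std::string{} : line.substr(start);
        }
    }
    return std::nullopt;
}
```

For example, read_proc_field("/proc/self/status", "Pid") would return the process ID of the calling process as text on a typical Linux system. Requires C++17 for std::optional.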

1.5 Optimization and HPC - High Performance Computing

1.5.1 Concepts Maps

Performance

  • Performance
  • Benchmark
  • Profiling
  • Memory Hierarchy
  • Microprocessor => Multicore Microprocessor

Cache Effects

  • CPU Cache Memory
    • Cache L1
    • Cache L2
    • Cache L3
    • Instruction Cache
    • Data Cache
  • Locality
    • Data locality
    • Temporal Locality
  • Cache effects
  • Cache misses
  • CPU Register Memory

Cache-Aware / Cache Oriented Programming

  • Cache-oblivious Algorithm
  • Cache-friendly code
  • Data-Oriented Design (Games)
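As a small illustration of the data locality and cache-friendly code entries above, the two functions below compute the same sum over a row-major matrix but traverse it in different orders. The function names are hypothetical, and the actual speed difference depends on the matrix size and the hardware, so it should be measured rather than assumed.

```cpp
#include <cstddef>
#include <vector>

// Sums a rows x cols matrix stored contiguously in row-major order.
// The inner loop walks adjacent addresses, so every byte of each cache
// line that is fetched gets used before the line is evicted.
double sum_row_order(const std::vector<double>& m,
                     std::size_t rows, std::size_t cols) {
    double total = 0.0;
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            total += m[r * cols + c];
        }
    }
    return total;
}

// Same elements, but the inner loop jumps `cols` elements per step.
// For large matrices this touches a different cache line on almost
// every access and tends to cause many more cache misses.
double sum_column_order(const std::vector<double>& m,
                        std::size_t rows, std::size_t cols) {
    double total = 0.0;
    for (std::size_t c = 0; c < cols; ++c) {
        for (std::size_t r = 0; r < rows; ++r) {
            total += m[r * cols + c];
        }
    }
    return total;
}
```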

Memory

  • Main Memory RAM / SDRAM
  • Memory alignment
  • SIMD => Single Instruction Multiple Data (Vectorization)
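A minimal sketch of the memory alignment entry above, using the standard alignas specifier. The 64-byte cache-line size is an assumption (common on x86-64 and many ARM cores, but not queried from the hardware), and the names kCacheLine, PaddedCounter and is_aligned are hypothetical.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Assumed cache-line size; not detected at run time.
constexpr std::size_t kCacheLine = 64;

// alignas aligns and pads each counter to its own cache line, so two
// threads updating neighbouring counters do not contend for one line.
struct alignas(kCacheLine) PaddedCounter {
    std::atomic<long> value{0};
};

static_assert(alignof(PaddedCounter) == kCacheLine, "unexpected alignment");
static_assert(sizeof(PaddedCounter) == kCacheLine, "unexpected padding");

// True if `p` is aligned to `alignment` bytes (must be a power of two).
bool is_aligned(const void* p, std::size_t alignment) {
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}
```

The same specifier can be applied to buffers intended for aligned SIMD loads, for example alignas(32) float v[8].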

Parallelism

  • Parallel Computing
  • General Purpose GPU Computing
    • =>> CUDA (Nvidia)
    • =>> OpenCL (Khronos Group)
    • =>> Metal (Apple)
  • Distributed Computing
  • Supercomputing

1.5.2 Optimization

1.5.3 Cache Effects and Memory Alignment

1.5.4 Ulrich Drepper's Articles - What every programmer should know about memory

1.5.5 Data Oriented Design

Note: It would be better stated as "Cache-oriented design."

1.5.6 Profiling and benchmarking

1.5.8 SIMD and GPU

  • Intel Intrinsics Guide (SIMD MMX, AVX and so on)
    • Brief: "The Intel Intrinsics Guide is an interactive reference tool for Intel intrinsic instructions, which are C style functions that provide access to many Intel instructions - including Intel® SSE, AVX, AVX-512, and more - without the need to write assembly code."
  • Paper - Debunking the 100X GPU vs. CPU Myth: An Evaluation of Throughput Computing on CPU and GPU [PDF]
    • Brief: "Recent advances in computing have led to an explosion in the amount of data being generated. Processing the ever-growing data in a timely manner has made throughput computing an important aspect for emerging applications. Our analysis of a set of important throughput computing kernels shows that there is an ample amount of parallelism in these kernels which makes them suitable for today’s multi-core CPUs and GPUs. In the past few years there have been many studies claiming GPUs deliver substantial speedups (between 10X and 1000X) over multi-core CPUs on these kernels. To understand where such large performance difference comes from, we perform a rigorous performance analysis and find that after applying optimizations appropriate for both CPUs and GPUs the performance gap between an Nvidia GTX280 processor and the Intel Core i7 960 processor narrows to only 2.5x on average. In this paper, we discuss optimization techniques for both CPU and GPU, analyze what architecture features contributed to performance differences between the two architectures, and recommend a set of architectural features which provide significant improvement in architectural efficiency for throughput kernels."
  • SIMD for C++ Developers (Konstantin) [PDF]
    • Brief: "Most of this article is focused on PC target platform. Some assembly knowledge is recommended, but not required, as the main focus of the article is SIMD intrinsics, supported by all modern C and C++ compilers. The support for them is cross-platform, same code will compile for Windows, Linux, legacy OSX (before ARM64 M1 switch), and couple recent generations of game consoles (except Nintendo which uses ARM processors)."
  • Into The Fray With SIMD - (Keith Slutskin and Kasima Tharpipitchai)
    • Brief: "This page is devoted to helping other students understand Single Instruction Multiple Data processors, using AltiVec and MMX as examples. This page aims to explore their development, their differences, and the impact that they've had on technology and the current industry. We also provide an area for students to test what they have learned. We hope that the reader will take away a better understanding of SIMD from this document. These comparisons between AltiVec and MMX can show what kinds of design choices must be made to move from theory to real world implementation. This might provide some insight into multiprocessing theory and technology."
  • An Introduction to Vectorization with the Intel Fortran Compiler [PDF]
  • Language Impact on Vectorization: Vector Programming in Fortran [PDF] - Zuse Institute Berlin.
  • Basics of Vectorization for Fortran Applications [PDF] - Inria / HAL archives ouvertes.
    • Brief: "This document presents a general view of vectorization (use of vector/SIMD instructions) for Fortran applications. The vectorization of code becomes increasingly important as most of the performance in current and future processor (in floating-point operations per second, FLOPS) depends on its use. Still, the automatic vectorization done by the compiler may not be an option in all cases due to dependencies, ambiguity, or sparse data access. In order to cover the basics of vectorization, this document explains the operation of vector instructions for different architectures, how code vectorization can be done, and how to test if your code has vectorized well. This document is intended mainly for use by developers and engineers with basic knowledge of computer architecture and programming in Fortran. It was designed to serve as a starting point for people working on the vectorization of applications, and does not address the subject in all its details."
  • NVidia - CUDA C++ Programming Guide - CUDA Toolkit Documentation
    • Brief: "The Graphics Processing Unit (GPU) provides much higher instruction throughput and memory bandwidth than the CPU within a similar price and power envelope. Many applications leverage these higher capabilities to run faster on the GPU than on the CPU (see GPU Applications). Other computing devices, like FPGAs, are also very energy efficient, but offer much less programming flexibility than GPUs. This difference in capabilities between the GPU and the CPU exists because they are designed with different goals in mind. While the CPU is designed to excel at executing a sequence of operations, called a thread, as fast as possible and can execute a few tens of these threads in parallel, the GPU is designed to excel at executing thousands of them in parallel (amortizing the slower single-thread performance to achieve greater throughput). The GPU is specialized for highly parallel computations and therefore designed such that more transistors are devoted to data processing rather than data caching and flow control."
  • ACM SIGARCH - SIMD Instructions Considered Harmful [ESSAY]
  • SIMD Programming CS 240A, 2017 [PDF]
  • Extending C++ for Explicit Data-Parallel Programming via SIMD Vector Types
  • ARM, x86 and RISC-V Microprocessors Compared - Erik Engheim
    • Brief: "A comparison of different design choices in the assembly language of three important microprocessor instruction-sets."
  • ARMv9: What is the Big Deal?. What is a Scalable Vector Extension… - by Erik Engheim
  • SIMD programming - Kenjiro Taura [PDF]
  • The C++ Scientist - Performance Considerations About SIMD Wrappers
  • The C++ Scientist - Writing C++ Wrappers for SIMD Intrinsics (1)
  • The C++ Scientist - Writing C++ Wrappers for SIMD Intrinsics (2)
  • The C++ Scientist - Writing C++ Wrappers for SIMD Intrinsics (3)
  • The C++ Scientist - Writing C++ Wrappers for SIMD Intrinsics (4)
  • The C++ Scientist - Writing C++ Wrappers for SIMD Intrinsics (5)
  • Intel: Parallel Programming and Optimization with Intel® Xeon Phi™ Coprocessors
    • Brief: "Created by Colfax International and Intel, and based on the book, Parallel Programming and Optimization with Intel® Xeon Phi™ Coprocessors, this short video series provides an overview of practical parallel programming and optimization with a focus on using the Intel® Many Integrated Core Architecture (Intel® MIC Architecture)."
  • Danluu - Assembly v. intrinsics
  • Joel Falcou - Boost.SIMD [VIDEO]
  • W3C - SIMD operations in WebGPU for ML - by Mehmet Oguz Derin
  • ARM Developer - SIMD ISAs - Official ARM company page about ARM SIMD instructions for vectorization and faster array computation.
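To give a flavour of what the intrinsics covered by these references look like, here is a minimal sketch that adds two float arrays four elements at a time with SSE, falling back to scalar code on targets without SSE. The function name add_arrays and the macro EXAMPLE_HAVE_SSE are hypothetical; the intrinsics themselves (_mm_loadu_ps, _mm_add_ps, _mm_storeu_ps) come from the standard xmmintrin.h header.

```cpp
#include <cstddef>

#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>  // SSE: __m128, _mm_loadu_ps, _mm_add_ps, _mm_storeu_ps
#define EXAMPLE_HAVE_SSE 1
#endif

// Computes out[i] = a[i] + b[i] for i in [0, n).
void add_arrays(const float* a, const float* b, float* out, std::size_t n) {
    std::size_t i = 0;
#ifdef EXAMPLE_HAVE_SSE
    // Process four floats per iteration using 128-bit registers.
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);  // unaligned load of 4 floats
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif
    // Scalar tail (and the whole loop on targets without SSE).
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

A loop this simple is usually auto-vectorized by modern compilers anyway, so hand-written intrinsics tend to pay off only for patterns the compiler cannot vectorize on its own.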

1.5.9 CPU Microarchitectures

General features of server, desktop, and embedded systems application processors

  • Desktop-grade CPU IC (Integrated Circuit) / Desktop-grade Processors
    • Smaller number of cores
    • Focus on balance between power efficiency and performance
    • Integrated GPU
    • Integrated MMU (Memory Management Unit) for supporting virtual memory
    • Lacks support for ECC RAM memory
  • Server-grade CPU IC (Integrated Circuit) / Server-grade Processors
    • Usually lacks an integrated GPU (iGPU) => A discrete GPU is needed for games or anything else that requires a GPU.
    • Has more cores per CPU IC (more threads)
    • Has larger caches
    • Focus on performance even at expense of more power usage.
    • Supports ECC (Error Correction Code) RAM memory
    • More expensive than Desktop-grade CPUs
    • Supports NUMA (Non-Uniform Memory Access)
    • More PCIe lanes
    • Integrated MMU (Memory Management Unit) for supporting virtual memory
  • Embedded-systems grade processor (Application Processor)
    • Unlike server-grade and desktop-grade processors, application processors target embedded systems. Consequently, they are often low-power and may contain integrated peripherals such as PWM, I2C and UART. Most of these processors are based on the ARM, PowerPC or MIPS architectures.
    • Focus on reliability, low power consumption and control of physical devices. Widely used in devices such as smartphones, tablets, network routers, robots, appliances, printers, drones, security cameras, IoT (Internet of Things) devices and so on. These processors can also be found on the Raspberry Pi and BeagleBone Black development boards.
    • Low power requirements, suited to battery-powered devices and fanless operation.
    • Operate at lower frequency than server-grade or desktop-grade processors for minimizing the power consumption.
    • May contain more than one CPU core.
    • May contain integrated flash memory for storing a firmware or bootloader.
    • May contain integrated GPU (Graphics Processing Unit)
    • May not contain an FPU (Floating Point Unit)
    • Often contain an MMU (Memory Management Unit) for supporting operating systems like Linux, BSD, Windows CE, VxWorks and so on.
    • Integrated peripherals for controlling external devices. These peripherals are often: GPIO (General Purpose IO) - digital IO; PWM (Pulse-Width Modulation), used for controlling power supplies, motors etc.; event counters; SPI bus - Serial Peripheral Interface; I2C bus communication peripheral; UART for RS232 communication; USB - Universal Serial Bus; JTAG support and more.
    • The main difference between an application processor and a microcontroller is that microcontrollers lack support for external RAM, external flash memory and external EEPROM, and lack the MMU (Memory Management Unit) necessary for running operating systems, such as Linux or BSD, that need virtual memory.
    • Example:

General

Intel Microarchitectures

Intel CPUs (Central Processing Units)

AMD (Advanced Micro Devices) Microarchitectures

AMD CPUs (Central Processing Units)

ARM Holdings Microarchitectures

Apple M1 / Apple Silicon (ARM-Based)

Note: based on the ARM ISA, but designed under an architecture license, which allows a custom microarchitecture.

1.5.10 Linear Algebra - BLAS/LAPACK/LINPACK

  • LAPACK - Linear Algebra Routines
  • LINPACK - Linear Algebra Routines
  • Solving System of Linear Equations with LAPACK (Apple)
  • Netlib - LAPACK - Linear Algebra Package
  • Scientific Computing Lecture 13: Linear Algebra with BLAS and LAPACK [VIDEO] - University of Toronto.
  • Matrix Expressions and BLAS/LAPACK; SciPy 2013 Presentation [VIDEO]
  • LAPACK Users' Guide - Release 1.0
    • Brief: "LAPACK is a transportable library of Fortran 77 subroutines for solving the most common problems in numerical linear algebra: systems of linear equations, linear least squares problems, eigenvalue problems and singular value problems. LAPACK is designed to supersede LINPACK and EISPACK, principally by restructuring the software to achieve much greater efficiency on vector processors, high-performance 'superscalar' workstations, and shared-memory multi-processors. LAPACK also adds extra functionality, uses some new or improved algorithms, and integrates the two sets of algorithms into a unified package. The LAPACK Users' Guide gives an informal introduction to the design of the algorithms and software, summarizes the contents of the package, describes conventions used in the software and documentation, and includes complete specifications for calling the routines. This edition of the Users' guide describes Release 1.0 of LAPACK."
  • Using LAPACK from C
    • Brief: "LAPACK and BLAS are originally written in Fortran and meant to be used in Fortran programs. Many vendors supply an optimised version of the LAPACK and BLAS libraries. Naturally, people who are programming in C or C++ and want to make use of the efficient implementation of the LAPACK/BLAS libraries. Here we will demonstrate how this can be done. At the end you find a complete working example, together with a script to run the same program using various interfaces and LAPACK libraries."
  • Template Numerical Toolkit - NIST
  • Using BLAS and LAPACK from Eigen
    • Brief: "Since Eigen version 3.3 and later, any F77 compatible BLAS or LAPACK libraries can be used as backends for dense matrix products and dense matrix decompositions. For instance, one can use Intel® MKL, Apple's Accelerate framework on OSX, OpenBLAS, Netlib LAPACK, etc. Do not miss this page for further discussions on the specific use of Intel® MKL (also includes VML, PARDISO, etc.) In order to use an external BLAS and/or LAPACK library, you must link you own application to the respective libraries and their dependencies. For LAPACK, you must also link to the standard Lapacke library, which is used as a convenient think layer between Eigen's C++ code and LAPACK F77 interface. Then you must activate their usage by defining one or multiple of the following macros (before including any Eigen's header): …"

1.5.11 Videos Selection

  • Intel - Architecture All Access: Modern CPU Architecture Part 1 – Key Concepts
  • Intel - Architecture All Access: Modern CPU Architecture Part 2 – Microarchitecture Deep Dive
    • Brief: "What is a CPU microarchitecture and what are the building blocks inside a CPU? Boyd Phelps, CVP of Client Engineering at Intel, takes us through key microarchitecture concepts like pipelines, speculation, branch prediction as well as the main building blocks in the front and back end of a CPU. Want to learn about the history of CPU architecture?"
  • Intel Video Series: Parallel Programming and Optimization with Intel® Xeon Phi™ Coprocessors
    • Brief: "Created by Colfax International and Intel, and based on the book, Parallel Programming and Optimization with Intel® Xeon Phi™ Coprocessors, this short video series provides an overview of practical parallel programming and optimization with a focus on using the Intel® Many Integrated Core Architecture (Intel® MIC Architecture)."
  • Intel - The Dawn of Standardizing Heterogenous Parallel Programming with DPC++ | HPC DevCon
    • Brief: "The variety of architectures has driven efforts to provide programming models and languages—some proprietary, and some driven by open communities and standards. None have fulfilled the promise of being “the one” that will enable the development community to preserve their programming investments by leveraging existing code to target other architectures with minimal changes. Learn more about Data Parallel C++ (DPC++) and its foundations on SYCL and C++ from James Reinders."
  • Inside Intel Compilers: Effective OpenMP SIMD Vectorization
    • Brief: "The relentless pace of Moore’s Law will lead to modern multi-core processors, coprocessors and GPU designs with extensive on-die integration of SIMD execution units on CPU and GPU cores to achieve better performance and power efficiency. To make efficient use of the underlying SIMD hardware, utilizing its wide vector registers and SIMD instructions such as Xeon Phi™, SIMD vectorization plays a key role of converting plain scalar C/C++/Fortran code into SIMD code that operating on vectors of data each holding one or more elements. Intel® Xeon processors and Xeon Phi™ coprocessors combine abundant thread parallelism with SIMD vector units. Efficiently exploiting SIMD vector units is one of the most important aspects in achieving high performance of the application code running on Intel® Xeon and Xeon Phi™. In this paper, we present Intel® compiler framework that supports OpenMP4.0/4.1 SIMD extensions, and also present a set of key vectorization techniques such as function vectorization, masking support, uniformity and linearity propagation, alignment optimization, gather/scatter optimization, remainder and peeling loop vectorization that are implemented inside the Intel® C/C++ and Fortran product compilers for Intel® Xeon processors and Xeon Phi™ coprocessors. A set of workloads from several application domains is employed to conduct the performance study of our SIMD vectorization techniques. The performance results show that we achieved up to 3x to ~12x performance gain on the Intel® Xeon processors and Xeon Phi™ coprocessors that illustrate how the power of compiler can be harnessed with minimum programmer efforts to enable effective SIMD parallelism. We also demonstrate a speedup ranging from ~100x to ~2000x with the seamless integration of SIMD vectorization and parallelization."
  • QCon London - Understanding CPU Microarchitecture to Increase Performance
    • Brief: "Alex Blewitt presents the microarchitecture of modern CPUs, showing how misaligned data can cause cache line false sharing, how branch prediction works and when it fails, and how to read CPU specific performance monitoring counters and use that in conjunction with tools like perf and toplev to discover where bottlenecks in CPU heavy code live."
  • CppCon 2018: Jefferson Amstutz "Compute More in Less Time Using C++ Simd Wrapper Libraries"
    • Brief: "Leveraging SIMD (Single Instruction Multiple Data) instructions are an important part of fully utilizing modern processors. However, utilizing SIMD hardware features in C++ can be difficult as it requires an understanding of how the underlying instructions work. Furthermore, there are not yet standardized ways to express C++ in ways which can guarantee such instructions are used to increase performance effectively. This talk aims to demystify how SIMD instructions can benefit the performance of applications and libraries, as well as demonstrate how a C++ SIMD wrapper library can greatly ease programmers in writing efficient, cross-platform SIMD code. While one particular library will be used to demonstrate elegant SIMD programming, the concepts shown are applicable to practically every C++ SIMD library currently available (e.g. boost.simd, tsimd, VC, dimsum, etc.), as well as the proposed SIMD extensions to the C++ standard library. Lastly, this talk will also seek to unify the greater topic of data parallelism in C++ by connecting the SIMD parallelism concepts demonstrated to other expressions of parallelism, such as SPMD/SIMT parallelism used in GPU computing."
  • CppCon 2016: Nicolas Guillemot "SPMD Programming Using C++ and ISPC"
    • Brief: "Love writing blazing fast SIMD code on CPU? Tired of dealing with ugly intrinsics and clumsy SIMD float4 classes? Has your compiler's auto-vectorization ever stopped working, causing unpredictable performance regressions? Wish you could write efficient SIMD code without locking yourself into a specific instruction set, while still taking advantage of a range of hardware from old desktops to new Intel Xeon Phi rigs? The solution is here, and it's called SPMD! SPMD is an elegant parallel programming technique for writing SIMD code, which automates the tedious constructions normally required when using intrinsics or assembly, breaks free of ties to specific instruction sets, and still allows you to work at the granularity of SIMD vectors when necessary."
  • Erwin Laure - Introduction to High Performance Computing
    • Brief: Shows several use cases and applications of high-performance computing and parallel computing, including physics, data mining, oil exploration, financial and economic modelling, weather prediction and aerospace design.
  • Intro to Compiler Directives for Accelerators (OpenACC compiler directive)
    • Brief: "In this video from the University of Houston CACDS HPC Workshop, Ty McKercher from NVIDIA presents: Intro to Compiler Directives for Accelerators. Geoscientists need tools to allow them to rapidly develop algorithms that run fast on accelerators, while at the same time deliver portability and improve productivity. They demand a single source code, with no need to maintain multiple code paths, using a high-level approach that presents a low learning curve. OpenACC provides directives-based approaches to rapidly accelerating applications for GPUs and other parallel architectures. This talk will serve as an introduction to programming with OpenACC 2.0. Participants will learn how to apply compiler directives to an existing application to parallelize the application for accelerated architectures."
  • GPU programming with modern C++ - Michael Wong {ACCU 2019}
    • Brief: "Parallel programming can be used to take advance of multi-core and heterogeneous architectures and can significantly increase the performance of software. It has gained a reputation for being difficult, but is it really? Modern C++ has gone a long way to making parallel programming easier and more accessible; providing both high-level and low-level abstractions. C++11 introduced the C++ memory model and standard threading library which includes threads, futures, promises, mutexes, atomics and more. C++17 takes this further by providing high level parallel algorithms; parallel implementations of many standard algorithms; and much more is expected in C++20. The introduction of the parallel algorithms also opens C++ to supporting non-CPU architectures, such as GPU, FPGAs, APUs and other accelerators. This talk will show you the fundamentals of parallelism; how to recognise when to use parallelism, how to make the best choices and common parallel patterns such as reduce, map and scan which can be used over and again. It will show you how to make use of the C++ standard threading library, but it will take this further by teaching you how to extend parallelism to heterogeneous devices, using the SYCL programming model to implement these patterns on a GPU using standard C++."
  • CppCon 2018: Elmar Westphal "Using Template Magic to Automatically Generate Hybrid CPU/GPU-Code"
    • Brief: "In this talk you’ll learn how you can write code that will either compile into a CPU based loop or into a special kind of function called “kernel" to be executed on a GPU. You’ll get an introduction into the memory- and threading-models of recent GPUs and are provided with examples for (mostly) simple helper templates to manage them. You can test and debug your code on CPU and scale out later. In the end, you’ll be able to parallelise operations on vectors without having to think much about the architecture. Template magic will take of that for you. Note: there are several ways to leverage the compute power of GPUs for your applications. There are pragma-based approaches like OpenACC or recent versions of OpenMP. Or you can take more control and use approaches like Nvidia’s CUDA, AMD’s similar HIP or the latest versions of OpenCL. All of the latter are based on subsets of the C++-14 standard with extensions to manage the execution of code (at least) on GPUs. This session will cover a CUDA-C++ based approach, but the techniques shown should be applicable to other models as well. Elmar Westphal, Forschungszentrum Juelich Scientific Programmer"
  • CppCon 2016: Pablo Halpern "Introduction to Vector Parallelism"
    • Brief: "Parallel programming is a hot topic, and everybody knows that multicore processors and GPUs can be used to speed up calculations. What many people don't realize, however, is that CPUs provide another way to exploit parallelism – one that predates recent multicore processors, has less overhead, requires no runtime scheduler, and can be used in combination with multicore processing to achieve even more speedup. It's called vector parallelism, and the hardware that implements it goes by brand names like SSE, AVX, NEON, and Altivec. If your parallel program does not use vectorization, you could be leaving a factor of 4 to 16 in performance on the floor. In some ways, Vector programming is easier than thread-based parallel programming because it provides ordering guarantees that more closely resemble serial programming. Without an intuitive framework by which to interpret them, the ordering rules can be confusing, however, and restrictions on vector code that don't apply to thread-parallel code must be kept in mind. In this talk, we'll introduce you to the common elements of most vector hardware, show what kind of C++ code can be automatically vectorized by a smart compiler, and talk about programmer-specified vectorization in OpenMP as well as proposals making their way through the C++ standards committee. You'll understand the rules of vectorization, so that you can begin to take advantage of the vector units already in your CPU. A basic understanding of C++11 lambda expressions is helpful."
  • Heterogeneous Programming in C++ today - Michael Wong {ACCU 2018}
    • Brief: "So why is the world rushing to add Massive Parallelism to base languages when consortiums and companies have been trying to fill that space for years? How is the landscape of Heterogeneous Parallelism changing in the various standards, and specifications? How will today’s programming models address the needs of future Internet of Things, self-driving cars and Machine Learning. I will give an overview as well as a deep dive into what C, C++ is doing to add parallelism, but also how consortiums like Khronos OpenCL/SYCL is pushing forward into the High-level Modern C++ Language support for GPU/Accelerators and SIMD programming. And ultimately, how these will converge into the future C++ Standard through future C++20 proposals such as executors, and affinity from my capacity of leading many of these efforts as chair of WG21's SG14."
  • Memory Resources in a Heterogeneous World - Michał Dominiak - CppCon 2019
    • Brief: "CUDA Thrust is a C++ parallel programming library built around the concepts and interfaces of the C++ standard library. When faced with the need for a composable interface for memory allocation in Thrust, we've reached to std::pmr - but std::pmr is inherently based around raw pointers, embedded deeply into signatures of virtual functions; this means it's not a great fit for a library that enables the use of GPUs for accelerated computation, which brings a need to handle different memory spaces in a type-safe way. Additionally, because accesses to memory are not uniform, the std::pmr model of pool resources doesn't quite work for CUDA and similar ecosystems. Thus came thrust::mr, which is a slight variation on std::pmr."
  • Efficient Array Computing in C++ with xtensor and Apache Arrow | SciPy 2017 | Sylvain Corlay
    • Brief: "This talk will discuss joint work between the xtensor and Apache Arrow open source projects, which can help enable the development of machine learning and other numerical computing applications. xtensor provides efficient multidimensional array computing for C++14 using expression templates, with Python bindings and NumPy interoperability. Apache Arrow provides cross-language array metadata and shared memory IO for moving tabular and tensor-like array data efficiently between compute environments."
  • https://www.youtube.com/watch?v=FRkJCvHWdwQ
  • EECE.6540 - Heterogeneous Computing - SIMD and Hardware Multithreading
  • Vector Forward Mode Automatic Differentiation on SIMD/SIMT architectures
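
As a rough illustration of the kind of loop the vector parallelism brief above describes as a candidate for automatic vectorization, here is a minimal sketch. The function name `saxpy` and its signature are assumptions made for this example. Whether a given compiler actually emits SSE/AVX/NEON instructions for it depends on the compiler and on optimization flags (for example `-O3` with GCC or Clang), so treat it as a sketch rather than a guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Computes y[i] = a * x[i] + y[i] over contiguous data.
// Each iteration is independent of the others and the memory access is
// unit-stride, which are the properties a vectorizing compiler looks for.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y)
{
    assert(x.size() == y.size());
    const float* xs = x.data();
    float* ys = y.data();
    const std::size_t n = x.size();
    for (std::size_t i = 0; i < n; ++i) {
        ys[i] = a * xs[i] + ys[i];
    }
}
```

With OpenMP 4.0 or later, a `#pragma omp simd` directive placed before the loop is one way to request vectorization explicitly instead of relying on the compiler's own analysis.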

1.6 Embedded Systems and Device Drivers

1.6.1 Fundamentals

  • The Design of C++0x - Bjarne Stroustrup - Texas A&M University [PRESENTATION]
  • Abstraction and the C++ Machine Model - Bjarne Stroustrup
    • http://www.stroustrup.com/abstraction-and-machine.pdf
    • Abstract: "C++ was designed to be a systems programming language and has been used for embedded systems programming and other resource-constrained types of programming since the earliest days. This paper will briefly discuss how C++'s basic model of computation and data supports time and space performance, hardware access, and predictability. If that was all we wanted, we could write assembler or C, so I show how these basic features interact with abstraction mechanisms (such as classes, inheritance, and templates) to control system complexity and improve correctness while retaining the desired predictability and performance."
  • Foundations of C++ - ETAPS 2012 Keynote - Bjarne Stroustrup
    • http://www.stroustrup.com/ETAPS-corrected-draft.pdf
    • Abstract: "C++ is a large and complicated language. People get lost in details. However, to write good C++ you only need to understand a few fundamental techniques – the rest is indeed details. This paper presents a few fundamental examples and explains the principles behind them. Among the issues touched upon are type safety, resource management, compile-time computation, error-handling, concurrency, and performance. The presentation relies on and introduces a few features from the recent ISO C++ standard, C++11, that simplify the discussion of C++ fundamentals and modern style."
  • Trends and future of C++: Evolving a systems language for performance - Bjarne Stroustrup
    • https://www.slideshare.net/slideshow/embed_code/key/tiw7gAcZOvRP88
    • Slide excerpt (Stroustrup - Madrid11, slide 13): "C++ maps directly onto hardware"
      • Mapping to the machine – Simple and direct
        • Built-in types fit into registers
        • Matches machine instructions
      • Abstraction
        • User-defined types are created by simple composition
        • Zero-overhead principle: what you don’t use you don’t pay for; what you do use, you couldn’t hand code any better
  • C Is Not a Low-level Language: Your computer is not a fast PDP-11 - David Chisnall
    • https://queue.acm.org/detail.cfm?id=3212479
    • Abstract: "In the wake of the recent Meltdown and Spectre vulnerabilities, it's worth spending some time looking at root causes. Both of these vulnerabilities involved processors speculatively executing instructions past some kind of access check and allowing the attacker to observe the results via a side channel. The features that led to these vulnerabilities, along with several others, were added to let C programmers continue to believe they were programming in a low-level language, when this hasn't been the case for decades."
  • volatile type qualifier - CppReference
  • WG14 - N1956 - volatile semantics for lvalues
    • http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1956.htm
    • "The following sections discuss the C semantics of the volatile keyword and show that they neither support existing practice nor, we believe, reflect the intent of the committee when they were crafted. The Suggested Technical Corrigendum then details changes to the C specification required to bring it into harmony with both, as well as with C++."
  • ISO C++ Committee - Const Correctness
  • Support for Embedded Programming in C++11 and C++14 - Scott Meyers
  • Modern C++ in embedded systems – Part 1: Myth and Reality - Dominic Herity
    • https://www.embedded.com/modern-c-in-embedded-systems-part-1-myth-and-reality/
    • Abstract: "… The suspicion lingers that C++ is somehow unsuitable for use in small embedded systems. For 8- and 16-bit processors lacking a C++ compiler, that may be a concern, but there are now 32-bit microcontrollers available for under a dollar supported by mature C++ compilers. As this article series will make clear, with the continued improvements in the language most C++ features have no impact on code size or on speed. Others have a small impact that is generally worth paying for. To use C++ effectively in embedded systems, you need to be aware of what is going on at the machine code level, just as in C. Armed with that knowledge, the embedded systems programmer can produce code that is smaller, faster and safer than is possible without C++."
  • Modern C++ embedded systems – Part 2: Evaluating C++ - Dominic Herity
    • https://www.embedded.com/modern-c-embedded-systems-part-2-evaluating-c/
    • "Having discussed the implementation of the main C++ language features in Part 1 of this series, we can now evaluate C++ in terms of the machine code it generates. Embedded system programmers are particularly concerned about code and data size; we need to discuss C++ in these terms."
  • Embedded programming with C++11 - Rainer Grimm [PRESENTATION]
  • Using C++ Efficiently In Embedded Applications - César A Quiroz
    • http://www.open-std.org/jtc1/sc22/wg21/docs/ESC_San_Jose_98_401_paper.pdf
    • "Abstract. Moving to C++ presents opportunities for higher programmer productivity. The requirements of embedded systems, however, demand that the adoption of C++ be carefully measured for the performance impact of run-time costs present in C++, but not in C. This talk suggests strategies for developers who are starting their acquaintance with C++."
  • C and C++ Embedded Software Nuggets - April 2018 - Matthew Eshleman
  • Appendix A - A Tutorial for Real-Time C++
  • C++ Templates for Embedded Code Part 1
  • C++ Templates for Embedded Code Part 2
  • LetsDestroyC.md
    • https://gist.github.com/shakna-israel/4fd31ee469274aa49f8f9793c3e71163#lets-destroy-c
    • "I have a pet project I work on, every now and then. CNoEvil. The concept is simple enough. What if, for a moment, we forgot all the rules we know. That we ignore every good idea, and accept all the terrible ones. That nothing is off limits. Can we turn C into a new language? Can we do what Lisp and Forth let the over-eager programmer do, but in C?"
    • Shows how C language features can be used for implementing:
      • Coroutines
      • Generics
      • New language constructs …

1.6.2 Motivation

1.6.3 Hardware Representation and MMIO - Memory Mapped IO

  • Modern C++ White paper: Making things do stuff - Glennan Carnie - Feabhas [BEST]
    • https://www.feabhas.com/sites/default/files/uploads/EmbeddedWisdom/Feabhas Modern C++ white paper Making things do stuff.pdf
    • "C has long been the language of choice for smaller, microcontroller-based embedded systems; particularly for close-to-the-metal hardware manipulation. C++ was originally conceived with a bias towards systems programming; performance and efficiency being key design highlights. Traditionally, many of the advancements in compiler technology, optimisation, etc., had centred around generating code for PC-like platforms (Linux, Windows, etc). In the last few years C++ compiler support for microcontroller targets has advanced dramatically, to the point where Modern C++ is an increasingly attractive language for embedded systems development. In this whitepaper we will explore how to use Modern C++ to manipulate hardware on a typical embedded microcontroller. We’ll see how you can use C++’s features to hide the actual underlying hardware of our target system and provide an abstract hardware API that developers can work to. We’ll explore the performance (in terms of memory and code size) of these abstractions compared to their C counterparts."
  • How to Combine Volatile with Struct - Michael Barr
    • https://embeddedgurus.com/barr-code/2012/11/how-to-combine-volatile-with-struct/
    • "C’s volatile keyword is a qualifier that can be used to declare a variable in such a way that the compiler will never optimize away any of the reads and writes. Though there are several important types of variables to declare volatile, this obscure keyword is especially valuable when you are interacting with hardware peripheral registers and such via memory-mapped I/O."
  • Representing and Manipulating Hardware in Standard C and C++ - Embedded Systems Conference San Francisco - Dan Saks
  • Memory-Mapped Devices as C++ Classes - Dan Saks
  • Volatile Objects - Dan Saks
    • https://www.dansaks.com/articles/1998-09 Volatile Objects.pdf
    • "For the past few months, I’ve been discussing the const qualifier, mostly with an eye on using const to place objects into ROM. I haven’t said all I have to say about const, but part of what I have left involves the volatile qualifier, as well. So this month, I’ll introduce you to the volatile qualifier. The volatile qualifier can appear anywhere that the const qualifier can. Whereas const declares objects that the program can’t change, volatile declares objects whose values might be changed by events outside the program’s control. A typical example of a volatile object is a memory-mapped input/output (I/O) port"
  • Exploiting C++'s features for efficient and safe hardware register access - Pete Goodliffe
    • https://accu.org/index.php/journals/281
    • https://www.drdobbs.com/cpp/register-access-in-c/184401954
    • Abstract: "Embedded programmers traditionally use C as their language of choice. And why not? It's lean and efficient, and allows you to get as close to the metal as you want. Of course C++, used properly, provides the same level of efficiency as the best C code. But we can also leverage powerful C++ features to write cleaner, safer, more elegant low-level code. This article demonstrates this by discussing a C++ scheme for accessing hardware registers in an optimal way."
  • Register Access in C++ - Pete Goodliffe
    • https://www.drdobbs.com/cpp/register-access-in-c/184401954
    • "Embedded programmers traditionally use C as their language of choice. And why not? It's lean and efficient, and lets you get as close to the metal as you want. Of course C++, used properly, provides the same level of efficiency as the best C code. Moreover, you can also leverage powerful C++ features to write cleaner, safer, more elegant low-level code. In this article, I present a C++ scheme for accessing hardware registers in an optimal way."
  • Device Registers in C - Colin Walls
    • https://www.embedded.com/device-registers-in-c/
    • Abstract: "One of the key benefits of the C language, which is the reason it is so popular for embedded applications, is that it is a high-level, structured programming language, but has low-level capabilities. The ability to write code that gets close to the hardware is essential and C provides this facility. This article looks at how C may be used to access registers in peripheral devices."
  • Access Memory Mapped I/O - Stack Overflow
  • Accessing memory-mapped classes directly - Dan Saks - 2010
    • https://www.eetimes.com/accessing-memory-mapped-classes-directly/
    • "If you think using pointers to access memory-mapped devices is too slow, here are some alternatives you can try. Device drivers typically communicate with hardware devices through device registers. Many processors use memory-mapped I/O, which maps device registers to fixed addresses in the conventional memory space. A typical device employs a small collection of registers with closely-spaced memory addresses. …"
  • Representing Memory-Mapped Devices as Objects - Dan Saks
  • Programming TMS320x28xx and TMS320x28xxx Peripherals in C/C++
  • A guide to better embedded C++ - GNSS C++ Solutions
  • Placing C variables at specific addresses to access memory-mapped peripherals
    • http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka3750.html
    • "In most ARM embedded systems, peripherals are located at specific addresses in memory. It is often convenient to map a C variable onto each register of a memory-mapped peripheral, and then read/write the register via a pointer. In your code, you will need to consider not only the size and address of the register, but also its alignment in memory. "
  • Arm MBED - C/C++ I/O Register Names
    • https://os.mbed.com/users/4180_1/notebook/cc-io-register-names/
    • "Normally you would always use mbed's I/O APIs from the Handbook to write programs for mbed since they are easier to use and you will be more productive working at a higher level of abstraction. Communication with I/O devices is done using special I/O registers that control the I/O device hardware. On a RISC processor, these I/O registers are typically mapped into memory locations at a fixed (absolute) address. A memory address space map in the figure below shows the areas used for I/O registers. In the rare case that it is necessary to directly communicate with I/O hardware or you want to experiment and write your own I/O drivers, there are already predefined I/O register names for mbed's I/O registers. C/C++ I/O register names appear in LPC17xx.h and it uses 32-bit hex constants to setup all of the correct addresses for the registers. Each I/O hardware unit has a name "LPC_hardwareunit". For each hardware unit, register names have been setup in a C structure at the correct address. In most cases, these are the register names used in the LPC1768 Users manual."
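
The articles in this subsection describe overlaying a structure of volatile members on a block of device registers. The sketch below shows that general idea under stated assumptions: the `UartRegisters` layout, the bit masks, and the `Uart` wrapper are all hypothetical and are not taken from any datasheet or from the linked articles. The driver receives a reference to the register block, so it can be exercised on a host against an ordinary object instead of a fixed hardware address.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical register layout. volatile tells the compiler that every read
// and write is observable and must not be optimized away or merged.
struct UartRegisters {
    volatile std::uint32_t data;     // offset 0x00
    volatile std::uint32_t status;   // offset 0x04
    volatile std::uint32_t control;  // offset 0x08
};

// Catch unexpected padding, which would misalign the overlay.
static_assert(sizeof(UartRegisters) == 12, "unexpected padding in register block");

constexpr std::uint32_t kStatusTxReady = 1u << 0;  // hypothetical bit position
constexpr std::uint32_t kControlEnable = 1u << 0;  // hypothetical bit position

class Uart {
public:
    explicit Uart(UartRegisters& regs) : regs_(regs) {}

    // Read-modify-write of the control register.
    void enable() { regs_.control = regs_.control | kControlEnable; }

    // Writes one byte if the transmitter reports ready; returns false otherwise.
    bool try_send(std::uint8_t byte)
    {
        if ((regs_.status & kStatusTxReady) == 0) {
            return false;
        }
        regs_.data = byte;
        return true;
    }

private:
    UartRegisters& regs_;
};

// On a real target the block would sit at a fixed address, for example:
//   auto& uart0 = *reinterpret_cast<UartRegisters*>(0x40001000);  // made-up address
```

Passing the register block by reference is a design choice made here for testability; placing the object at a fixed address through the linker or a cast is the usual approach on real hardware.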

1.6.4 ISR - Interrupt Service Routine

  • Interrupts in C++ - Alan Dorfmeyer and Pat Baird
    • https://www.embedded.com/interrupts-in-c/
    • "An ideal C++ device driver would be a class containing, among other things, the ISR as a member function. But this is harder to achieve than many C programmers assume. One of the goals of a recent project was to evaluate the effectiveness of C++ in writing low-level device drivers. With a push to reduce time to market, we were given a budget large enough to order some nice object modeling tools."
  • Implementing Interrupt Service Routines in C++ - Bill Gatliff
    • https://www.drdobbs.com/implementing-interrupt-service-routines/184401485?pgno=1
    • "Some people say that C++ has poor support for interrupt handler implementations. Others claim that ISRs (interrupt service routines) simply can’t be implemented in C++ at all, or, if they can, they’re terribly inefficient when compared to equivalent C or assembly language implementations. The truth is that you can implement interrupt handlers in C++, and you can do so with the same low overhead imposed by C. The secret to success lies in understanding how to use C++’s language features properly, and in knowing how to organize things to take advantage of the inherent differences between the C and C++ ways of solving problems. This article presents two different techniques for implementing interrupt handlers in C++. Each has its own set of advantages and disadvantages, but odds are that at least one of them is appropriate for whatever embedded application you are developing now."
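
Both articles above discuss making an ISR part of a C++ driver class. A non-static member function needs a `this` pointer, which the hardware cannot supply, so one common workaround is a static "trampoline" that forwards to a registered instance. The sketch below shows that pattern only; it is not claimed to be either of the techniques presented in the articles, and the `TimerDriver` class and its members are hypothetical names. How the trampoline gets into the vector table is toolchain-specific and is left out.

```cpp
#include <cassert>
#include <cstdint>

class TimerDriver {
public:
    // Registers this object as the target of the interrupt.
    void attach() { instance_ = this; }

    // A static member function has no this pointer, so its address can be
    // used like that of an ordinary function when installing the handler.
    static void isr_trampoline()
    {
        if (instance_ != nullptr) {
            instance_->on_interrupt();
        }
    }

    std::uint32_t ticks() const { return ticks_; }

private:
    // The actual handler body, with full access to the driver's state.
    void on_interrupt() { ticks_ = ticks_ + 1; }

    static TimerDriver* instance_;
    volatile std::uint32_t ticks_ = 0;  // modified from interrupt context
};

TimerDriver* TimerDriver::instance_ = nullptr;
```

On a host the trampoline can simply be called directly to simulate the interrupt, which also makes the handler logic unit-testable.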

1.6.5 ROM Read-Only Memory, ROM-able types Objects and constexpr

  • General Constant Expression for System Programming Languages - Gabriel Dos Reis and Bjarne Stroustrup
    • http://www.stroustrup.com/sac10-constexpr.pdf
    • Abstract: "Most mainstream system programming languages provide support for builtin types, and extension mechanisms through user-defined types. They also come with a notion of constant expressions whereby some expressions (such as array bounds) can be evaluated at compile time. However, they require constant expressions to be written in an impoverished language with minimal support from the type system; this is tedious and error-prone. This paper presents a framework for generalizing the notion of constant expressions in modern system programming languages. It extends compile time evaluation to functions and variables of user-defined types, thereby including formerly ad hoc notions of Read Only Memory (ROM) objects into a general and type safe framework. It allows a programmer to specify that an operation must be evaluated at compile time. Furthermore, it provides more direct support for key meta programming and generative programming techniques. The framework is formalized as an extension of underlying type system with a binding time analysis. It was designed to meet real-world requirements. In particular, key design decisions relate to balancing expressive power to implementability in industrial compilers and teachability. It has been implemented for C++ in the GNU Compiler Collection, and is part of the next ISO C++ standard."
  • Bitesize Modern C++ : constexpr - Glennan Carnie
  • Modern C++ embedded systems – Part 2: Evaluating C++
  • C++ CONSTEXPR COMPILE-TIME LOOKUP TABLE GENERATION
  • Exploring constexpr at Runtime - WG21 / N3583
  • Compile-time cosine lookup table with C++
  • C++ - Generating Lookup Tables at Compile-Time
  • Use constexpr for faster, smaller, and safer code
    • https://blog.trailofbits.com/2019/06/27/use-constexpr-for-faster-smaller-and-safer-code/
    • "With the release of C++14, the standards committee strengthened one of the coolest modern features of C++: constexpr. Now, C++ developers can write constant expressions and force their evaluation at compile-time, rather than at every invocation by users. This results in faster execution, smaller executables and, surprisingly, safer code. Undefined behavior has been the source of many security bugs, such as Linux kernel privilege escalation (CVE-2009-1897) and myriad poorly implemented integer overflow checks that are removed due to undefined behavior. The C++ standards committee decided that code marked constexpr cannot invoke undefined behavior when designing constexpr. For a comprehensive analysis, read Shafik Yaghmour’s fantastic blog post titled 'Exploring Undefined Behavior Using Constexpr.'"
  • Exploring Undefined Behavior Using Constexpr - Shafik Yaghmour
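
Several links in this subsection cover generating lookup tables at compile time with constexpr. As a minimal sketch of that technique (assuming a C++14 or later compiler), the example below builds the standard reflected CRC-32 table with polynomial 0xEDB88320 inside a constexpr function. The names `Crc32Table`, `make_crc32_table`, `kCrc32Table`, and `crc32` are chosen for this example. A constexpr object like this is a candidate for placement in read-only memory, but where it ends up depends on the toolchain and linker script.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Crc32Table {
    std::uint32_t entries[256];
};

// Builds the reflected CRC-32 table (polynomial 0xEDB88320).
constexpr Crc32Table make_crc32_table()
{
    Crc32Table table{};
    for (std::uint32_t i = 0; i < 256; ++i) {
        std::uint32_t crc = i;
        for (int bit = 0; bit < 8; ++bit) {
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
        }
        table.entries[i] = crc;
    }
    return table;
}

// Initializing a constexpr variable forces evaluation at compile time.
constexpr Crc32Table kCrc32Table = make_crc32_table();

// The check itself runs in the compiler, not on the target.
static_assert(kCrc32Table.entries[1] == 0x77073096u, "CRC-32 table mismatch");

// Ordinary run-time CRC-32 that only reads the precomputed table.
inline std::uint32_t crc32(const unsigned char* data, std::size_t size)
{
    std::uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < size; ++i) {
        crc = kCrc32Table.entries[(crc ^ data[i]) & 0xFFu] ^ (crc >> 8);
    }
    return crc ^ 0xFFFFFFFFu;
}
```

A plain struct wrapping a C array is used instead of `std::array` so that the sketch also compiles as C++14, where the non-const `std::array::operator[]` is not yet constexpr.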

1.6.6 C++ Standard Papers Proposals and Freestanding library

  • Freestanding vs. hosted implementations - Dan Saks
  • P0829R2 - Freestanding Proposal
    • http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0829r2.html
    • "Add everything to the freestanding implementation that can be implemented without OS calls and space overhead. The current definition of the freestanding implementation is not very useful. Here is the current high level definition from [intro.compliance]: Two kinds of implementations are defined: a hosted implementation and a freestanding implementation. For a hosted implementation, this document defines the set of available libraries. A freestanding implementation is one in which execution may take place without the benefit of an operating system, and has an implementation-defined set of libraries that includes certain language-support libraries ([compliance])."
    • Note: "A freestanding version of the standard library is intended for use without OS (Operating System) support or with limited OS support such as in device drivers."
  • P1377R0 - Summary of Dec 2018 SG14 freestanding discussions
  • P0709 - Zero-overhead deterministic exceptions: Throwing values
    • http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0709r0.pdf
    • "This paper aims to extend C++’s exception model to let functions declare that they throw a statically specified type by value. This lets the exception handling implementation be exactly as efficient and deterministic as a local return by value, with zero dynamic or non-local overheads."
  • P1028R0: SG14 status_code and standard error object for P0709 Zero-overhead deterministic exceptions
  • Non-throwing container operations
    • http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0132r1.html
    • "This paper explores alternatives for adding non-throwing container operations, namely alternatives to throwing exceptions from failing modifications. Based on LEWG feedback from Jacksonville 2018 meeting, the focus is on minor additions to existing container APIs, instead of completely-custom allocators or completely-new containers. This paper suggests an evolutionary step and asks LEWG to clarify that the step is in the right direction."
  • P0037R5 - Fixed-Point Real Numbers
    • http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0037r5.html
    • See: Compositional Numeric Library
    • "This proposal introduces a system for performing fixed-point arithmetic using integral types. Floating-point types are an exceedingly versatile and widely supported method of expressing real numbers on modern architectures. However, there are certain situations where fixed-point arithmetic is preferable: Some systems lack native floating-point registers and must emulate them in software; many others are capable of performing some or all operations more efficiently using integer arithmetic; certain applications can suffer from the variability in precision which comes from a dynamic radix point [pathengine]; in situations where a variable exponent is not desired, it takes valuable space away from the significand and reduces precision and not all hardware and compilers produce exactly the same results, leading to non-deterministic results."
  • A Standard Audio API for C++: Motivation, Scope, and Basic Design
    • http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1386r2.pdf
    • "This paper proposes to add a low-level audio API to the C++ standard library. It allows a C++ program to interact with the machine’s sound card, and provides basic data structures for processing audio data. We argue why such an API is important to have in the standard, why existing solutions are insufficient, and the scope and target audience we envision for it. We provide a brief introduction into the basics of digitally representing and processing audio data, and introduce essential concepts such as audio devices, channels, frames, buffers, and samples. We then describe our proposed design for such an API, as well as examples how to use it. An implementation of the API is available online. Finally, we mention some open questions that will need to be resolved, and discuss additional features that are not yet part of the API as presented but will be added in future papers."
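
The fixed-point proposal listed above (P0037R5) is about performing fixed-point arithmetic using integral types. To make the idea concrete, here is a minimal Q16.16 sketch. The `Fixed16` class and its member names are hypothetical and do not reproduce the API proposed in P0037R5; overflow handling and rounding are deliberately left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical Q16.16 value: 16 integer bits and 16 fractional bits stored
// in a 32-bit integer. The real number represented is raw / 65536.
class Fixed16 {
public:
    static constexpr int kFractionBits = 16;

    static constexpr Fixed16 from_int(std::int32_t value)
    {
        return Fixed16(value * (std::int32_t{1} << kFractionBits));
    }
    static constexpr Fixed16 from_raw(std::int32_t raw) { return Fixed16(raw); }

    constexpr std::int32_t raw() const { return raw_; }

    // Truncates toward zero.
    constexpr std::int32_t to_int() const
    {
        return raw_ / (std::int32_t{1} << kFractionBits);
    }

    // Addition works directly on the raw integers because the scale is shared.
    friend constexpr Fixed16 operator+(Fixed16 a, Fixed16 b)
    {
        return Fixed16(a.raw_ + b.raw_);
    }

    // Multiplication widens to 64 bits so the intermediate product fits,
    // then divides by the scale once to return to Q16.16.
    friend constexpr Fixed16 operator*(Fixed16 a, Fixed16 b)
    {
        return Fixed16(static_cast<std::int32_t>(
            (static_cast<std::int64_t>(a.raw_) * b.raw_) /
            (std::int64_t{1} << kFractionBits)));
    }

private:
    constexpr explicit Fixed16(std::int32_t raw) : raw_(raw) {}
    std::int32_t raw_;
};
```

For example, 1.5 has the raw value 0x18000 and 2.5 has 0x28000; their product 3.75 comes out as 0x3C000 using integer instructions only.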

1.6.7 Memory Allocation and allocators

1.6.8 Embedded Systems in Real World

  • CODE BLUE 2014 : A security assessment study and trial of Tricore-powered automotive ECU by DENNIS KENGO OKA & TAKAHIRO MATSUKI
  • C++ Architecture for UAV Simulations
    • https://www.researchgate.net/publication/299771353_C_Architecture_for_UAV_Simulations
    • "The C++ computer language is well suited to model multi-vehicle engagements. Its prowess is exemplified by the conversion of a unmanned aerial vehicle simulation from FORTRAN to C++. The new architecture accommodates besides UAVs and moving targets also targeting satellites. Its class structure is outlined, and the communication bus between the encapsulated vehicle-objects is discussed. A generic UAV model with five degrees-of-freedom fidelity is used to demonstrate the interactive features of the simulation. Our experience has shown that C++ is the programming environment of choice for networked simulation"
  • Autonomous Flight of a Quadrocopter Group with the Use of the Virtual Leader Strategy
    • http://ceur-ws.org/Vol-2500/paper_16.pdf
    • "In this article we present an algorithm of controlling the quadrocopters swarm and a theory of applying the Kalman filter for the equations of motion of a quadrocopter in mountainous conditions. In our case, in order to coordinate the group, it is necessary to form the spatial programmatic trajectory of the UAV using the appropriate control law. The concept of coordinated reversal is introduced, which allows to obtain analytical equations of spatial motions expressed through the definition of the velocity vector and the yaw angle. The algorithm was tested in the Gazebo simulator. The results are used for spatial motion of quadrocopter groups."
  • UAV Flight Experiments with a RT-Linux Autocode Environment including a Navigation Filter and a Spline Controller.

1.6.9 Testing and HIL - Hardware-In-The Loop Simulation

  • Real-time simulation system aids complex system design
    • https://www.embedded.com/real-time-simulation-system-aids-complex-system-design/
    • Abstract: "Components and subsystems that form large, complex systems such as automobiles and aircraft need testing before the entire system is built. An engine-control unit (ECU) for example, has numerous sensors that must be simulated to test how the ECU responds to normal and abnormal conditions. While hardware-in-the-loop (HIL) systems have been around for years, Bloomy Controls has developed a system that can handle most of the inputs and outputs needed to simulate a system. You add the customization."
  • Matlab EXPO 2016 - Ein Modell, viele Zielsysteme - Automatische Codegenerierung aus Matlab Simulink (In German: "One model, many target systems - automatic code generation from Matlab Simulink")
  • HIL Simulator of Drives of an Industrial Robot with 6 DOF
    • https://pdfs.semanticscholar.org/5010/2ed032104d414b6e5c696afcbac42c369833.pdf
    • Abstract: "The paper deals with design of a Hardware-in-the-Loop simulator of an industrial robot with six degrees of freedom. The robot is driven by industrial frequency converters of the SINAMICS S120 type. They communicate via CAN bus with the master control system RT-LAB executing control algorithms in real time. Such a complex task combines information from mechanics, electric drives, control theory, robotics, programming, and a deep knowledge of a frequency converter control structure. Proposed algorithms are verified experimentally and the resulting time responses show good agreement with expected results."
  • FPGA based Hardware-in-the-Loop Simulation for Digital Control of Power Converters using VHDL-AMS
    • http://ijarai.thesai.org/Downloads/Volume9No12/Paper_73-FPGA_based_Hardware_in_the_Loop_Simulation.pdf
    • "This paper presents a new approach for complex system design, allowing rapid, efficient and low-cost prototyping. Using this approach can simplify designing tasks and go faster from system modeling to effective hardware implementation. Designing multi-domain systems require different engineering competences and several tools, our approach gives a unique design environment, based on the use of VHDL-AMS modeling language and FPGA device within a single design tool. This approach is intended to enhance hardware-in-the-loop (HIL) practices with a more realistic simulation which improve the verification process in the system design flow. This paper describes the implementation of a software/hardware platform as effective support for our methodology. The feasibility and the benefits of the presented approach are demonstrated through a practical case study of a power converter control. The obtained results show that the developed method achieves significant speed-up compared with conventional simulation methods, using minimum resources and minimum latency."
  • Model- and Hardware-in-the-Loop Testing in a Model-Based Design Workflow
    • http://lup.lub.lu.se/luur/download?func=downloadFile&recordOId=8776530&fileOId=8776533
    • "Model-Based Design is a development method that is becoming popular to use when creating control systems. In this thesis a demonstration of the advantages of using this method is made for Combine Control Systems AB. The 3D simulation software IndustrialPhysics is used to represent a real process in form of a gantry crane. A controller for this crane is developed in Simulink and Model-in-the-Loop (MiL) testing is done together with the 3D model. C code is then generated from the controller and transferred to a PLC. A control panel with buttons is connected to the PLC and Hardware-in-the-Loop (HiL) testing is done together with the 3D model. The result of the thesis is a working HiL rig ready to be used on technical fairs to demonstrate the capabilities of the Model-Based Design method."
  • A Framework for Real Time Hardware in the loop Simulation for Control Design
    • https://arxiv.org/pdf/1410.1342.pdf
    • "This paper presents a simple framework of low cost Kit which can be used in control education and training courses to support hardware in the loop simulation. The kit shows the student or control engineer the effect of delays, noise, and saturation on the control system. The framework is generic and flexible to give the user the ability to test and simulate any controller on any process. The framework uses Matlab® environment which gives the user many tools to build his/her system in a fast and accurate way. Some test cases are presented for using the framework on different controllers."
  • HIL - Hardware-In-The-Loop simulation - using master-slave computational device (In Portuguese)
    • http://www.bibl.ita.br/viiiencita/Simulacao - hardware in loop-.pdf
    • "Abstract: This work studied the possibility to apply the computational hardware in the configuration master-slave in simulation hardware-in-the-Loop (HIL), being the control made by a micro-CLP. In addition, it searched to identify factors that limit the application of this system, considering the attempt to use it for the control of a known physical model: a system of magnetic levitation."
  • Hardware In The Loop simulation applied to Unmanned Underwater Vehicles (UUVs) (In Portuguese)
    • https://www.teses.usp.br/teses/disponiveis/3/3152/tde-09022009-164239/publico/Harware_in_The_Loop_Simulation_UUV.pdf
    • "Unmanned Underwater Vehicles (UUVs) have many commercial, military, and scientific applications because of their potential capabilities and significant cost performance improvements over traditional means of obtaining valuable underwater information. The development of a reliable sampling and testing platform for these vehicles requires a thorough system design and many costly at-sea trials during which systems specifications can be validated. Modeling and simulation provide a cost-effective measure to carry out preliminary component, system (hardware and software), and mission testing and verification, thereby reducing the number of potential failures in at-sea trials. An accurate simulation environment can help engineers to find hidden errors in the UUV embedded software and gain insights into the UUV operation and dynamics. This work describes the implementation of a UUV's control algorithm using MATLAB/SIMULINK, its automatic conversion to an executable code (in C++) and the verification of its performance directly into the embedded computer using simulations. It is detailed the necessary procedure to allow the conversion of the models from MATLAB to C++ code, integration of the control software with the real time operating system used on the embedded computer (VxWORKS) and the developed strategy of Hardware in the loop Simulation (HILS). The Main contribution of this work is to present a rational framework to support the final implementation of the control software on the embedded computer, starting from the model developed on an environment friendly to the control engineers, like SIMULINK."
  • Hardware in the Loop Robot Simulators for On-site and Remote Education in Robotics
    • https://www.ijee.ie/articles/Vol22-4/12_ijee1671.pdf
    • "LIKE MOST FIELDS in engineering, hands-on education in control, mechatronics and robotics requires the development of laboratories that provide a variety of experiments, flexibility and ease-of-use. However, high investment and maintenance costs as well as safety issues related to those labs pose important limitations and call for serious consideration to be given to the choice of equipment and design of experiments. Resorting to off-site facilities is another way to address the above limitations, which is an approach gaining increasing attraction as an enhancement or alternative to conventional education tools. Remote labs are a recent and rapidly growing outcome of Information Technology, providing environments to which users are given access from anywhere in the world using the Internet in order to perform experiments, watch the performance and/or collect back data for analysis. In this paper, a novel hardware-in-the-loop (HIL) simulator setup is proposed as an efficient laboratory tool in the education of robotics, mechatronics, and control. The utilization of this novel approach as an on-site and remote experimentation tool is also discussed in detail for robotics education."
  • A Hardware-In-The-Loop Simulator for Software Development for a Mars Airplane

1.6.10 Frameworks and Libraries

  • C++ Embedded Frameworks
  • Embedded Template Library
    • https://www.etlcpp.com/
    • "C++ is a great language to use for embedded applications and templates are a powerful aspect of it. The standard library can offer a great deal of well tested functionality, but there are some parts that do not fit well with deterministic behaviour and limited resource requirements. These limitations usually preclude the use of dynamically allocated memory which means that the STL containers are unusable."
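
The restriction quoted above (no dynamically allocated memory) is usually worked around with containers whose capacity is fixed at compile time. The following is only a minimal sketch of that idea, not the ETL's actual API: the class name FixedVector and all of its members are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-capacity vector: the storage lives inside the object,
// so no heap allocation ever happens. Illustration only, not the ETL API.
template <typename T, std::size_t Capacity>
class FixedVector {
public:
    // Returns false instead of allocating when the container is full.
    bool push_back(const T& value) {
        if (size_ == Capacity) {
            return false;
        }
        data_[size_++] = value;
        return true;
    }

    std::size_t size() const { return size_; }
    constexpr std::size_t capacity() const { return Capacity; }
    bool full() const { return size_ == Capacity; }

    // Bounds are checked only in debug builds, through assert().
    T& operator[](std::size_t index) {
        assert(index < size_);
        return data_[index];
    }
    const T& operator[](std::size_t index) const {
        assert(index < size_);
        return data_[index];
    }

private:
    std::array<T, Capacity> data_{};  // requires a default-constructible T
    std::size_t size_ = 0;
};
```

Usage: a FixedVector<int, 8> placed on the stack or in static storage has a worst-case memory footprint that is known at link time, and push_back reports a full container through its return value instead of throwing or allocating.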

1.6.11 Debugging

1.6.12 Error Handling

  • C’s goto Keyword: Should we Use It or Lose It? - Michael Barr
  • Unified error handling for microcontrollers(C++)
    • https://itnan.ru/post.php?c=1&p=456540
    • "Using of C++ in embedded software development could very often face an issue that standard libraries usage causes undesirable additional resources consumption of ROM and RAM. That's why some classes and methods from 'std' library doesn't suits for implementation in microcontrollers. There are dynamic memory (heap), RTTI and exceptions usage restrictions in the embedded software development. In order to create compact and quick working code we couldn't just use 'std' library, and for example, 'typeid' operator, because RTTI support is needed and this is an overhead in common case. Sometimes one have to «reinvent the wheel» to satisfy that conditions. The number of such tasks is small, but they are still need to be done. The article describes an easy task from the first sigh — return codes expansion for the existing subsystems in embedded software."
  • Error Handling now and tomorrow
  • Use of Assertions / Embedded in Academia - John Regehr
  • 8 tips for squashing bugs using ASSERT in C - Jacob Beningo
  • How to Define Your Own assert() Macro for Embedded Systems
  • How and When to Use C's assert() Macro
    • https://barrgroup.com/embedded-systems/how-to/use-assert-macro
    • "The assert() macro is one of those simple tools that would not seem to merit an entire article, but I have come across an alarming number of engineers who have not heard of it or do not use it. Hopefully this article will help bolster the number who make good use of this feature. In this article, we will look at appropriate use of assertions, and in the follow-on article How to Define Your Own assert() Macro for Embedded Systems, we will examine how we can write the assert() macro ourselves."
  • 14.10.17 Using assert() in Embedded Systems
  • Inception: System-Wide Security Testing of Real-World Embedded Systems Software - Nassim Corteggiani (Maxim Integrated / EURECOM)
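
Several of the articles above discuss writing a custom assert() macro for embedded targets. The following is a rough sketch of one possible shape, not the code from any of those articles: APP_ASSERT, APP_NO_ASSERT, g_assert_handler and both handlers are hypothetical names, and the sketch assumes that the handler is installed once during single-threaded start-up.

```cpp
#include <cstdlib>

// Hypothetical handler type: receives the location of the failed check.
using AssertHandler = void (*)(const char* file, int line, const char* expr);

// Default handler. A real target might log to non-volatile memory and
// reset the device; this sketch simply aborts.
void default_assert_handler(const char*, int, const char*) {
    std::abort();
}

// Alternative handler for host-side unit tests: counts failures and returns.
int g_assert_failures = 0;
void counting_assert_handler(const char*, int, const char*) {
    ++g_assert_failures;
}

// Replaceable handler (assumption: set once, before any assertion can fire).
AssertHandler g_assert_handler = default_assert_handler;

// Defining APP_NO_ASSERT compiles every check out, like NDEBUG does
// for the standard assert().
#ifdef APP_NO_ASSERT
#define APP_ASSERT(expr) ((void)0)
#else
#define APP_ASSERT(expr) \
    ((expr) ? (void)0 : g_assert_handler(__FILE__, __LINE__, #expr))
#endif
```

Routing failures through a function pointer keeps the macro itself small and lets the same code base abort on the target while merely counting failures in tests.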

1.6.13 Embedded Linux

1.6.14 Reverse Engineering

1.6.15 Miscellaneous

  • Software emulation of STM32 controller for virtual embedded design/test environment - Joshi, C.V.
    • https://pure.tue.nl/ws/portalfiles/portal/138967031/1377566_MScThesisChandrika_Joshi_submission.pdf
    • Abstract: "Integrating the emulated Hardware into the embedded test environment facilitates iterative and modular testing of Embedded SoftWare (ESW) at the initial phases of the ESW Development Life Cycle (EDLC). Emulation technology eliminates the dependency on hardware and facilitates ESW testing to identify the defects in the early stages of ESW Development. Hardware emulation has been around in the industry for testing the hardware design using Verilog & hardware design simulators like HILO. Significant amounts of hardware design testing carried out before the fabrication of hardware chips has proven to be cost effective and saving effort for the hardware design and development [40]. Recently, there have been advancements in the software emulation for Embedded devices paving its way into Embedded SDLC [1]. In this thesis, a detailed study is conducted on the existing verification techniques and their drawbacks with respect to the embedded system testing. Based on this study, an implementation of the STM32f407ve controller emulation is carried out on an open-source platform by Fabrice Ballard-QEMU [9]. QEMU provides essential APIs to develop and use the emulated hardware board to achieve virtualization. Initial work has been carried out by freelancers on QEMU to build various boards, which have been referred for this project to develop the specific board of STM32 for the test environment at Vanderlande. The development includes adding the hardware machine emulation of STM32 to the QEMU with the emulated peripherals clock control and GPIO. The emulated hardware has been examined to understand the behavior and performance concerning the functional testing, time-based testing, CPU load as compared to the real hardware. This thesis initiates a view towards utilizing the virtual test environments for Embedded SDLC over traditional test setups"

1.6.16 Case studies of high profile bugs and design flaws

See also:

1.7 Low-Level, Kernel and System Programming

1.8 Standard Proposals

  • p1040R0: std::embed
    • Brief: Accessing program-external resources at compile-time and making them available to the developer. It aims to make it easier to embed files, pictures, binary files and documents in C++ executables or shared libraries.
    • Abstract: This paper introduces a function std::embed in the <embed> header for pulling resources at compile-time into your program and optionally guaranteeing that they are stored in the resulting program in an implementation-defined manner.
  • P0194R0 - Static Reflection Revision 4.
    • Abstract: This paper is the follow-up to N3996, N4111 and N4451 and it is the fourth revision of the proposal to add static reflection to the C++ standard. It also introduces and briefly describes a partial, experimental implementation of this proposal.
  • p0707r0 Metaclasses for static reflection.
  • P1028R0 status_code and standard error object for P0709 zero-overhead deterministic exceptions.
  • p0037r5 - Fixed-point arithmetics using integral types.
  • P0784R6 - More constexpr containers
    • "Variable size container types, like std::vector or std::unordered_map, are generally useful for runtime programming, and therefore also potentially useful in constexpr computations. This has been made clear by some recent experiments such as the Constexpr ALL the things! presentation (and its companion paper P0810R0 published in the pre-Albuquerque mailing) by Ben Deane and Jason Turner, in which they build a compile-time JSON parser and JSON value representation using constexpr. Amongst other things, the lack of variable size containers forces them to use primitive fixed-size data structures in the implementation, and to parse the input JSON string twice; once to determine the size of the data structures, and once to parse the JSON into those structures."
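
The two-pass workaround described in the P0784R6 quote above (one pass to determine the size, one pass to fill a fixed-size structure) can be sketched on a much smaller problem than JSON. This is only an illustration under the assumption of a C++17 compiler; count_fields, split_fields, kInput and kFields are hypothetical names, not taken from the paper.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// Pass 1: count the comma-separated fields at compile time.
constexpr std::size_t count_fields(std::string_view text) {
    std::size_t count = 1;
    for (char c : text) {
        if (c == ',') {
            ++count;
        }
    }
    return count;
}

// Pass 2: split into a fixed-size array whose size N came from pass 1.
template <std::size_t N>
constexpr std::array<std::string_view, N> split_fields(std::string_view text) {
    std::array<std::string_view, N> fields{};
    std::size_t start = 0;
    std::size_t index = 0;
    for (std::size_t i = 0; i <= text.size(); ++i) {
        if (i == text.size() || text[i] == ',') {
            fields[index++] = text.substr(start, i - start);
            start = i + 1;
        }
    }
    return fields;
}

constexpr std::string_view kInput = "alpha,beta,gamma";
constexpr auto kFields = split_fields<count_fields(kInput)>(kInput);

// Both passes ran during compilation.
static_assert(kFields.size() == 3);
static_assert(kFields[1] == "beta");
```

The input has to be scanned twice because the array size is a template argument and must be known before the array can be declared; a constexpr-enabled std::vector would remove the first pass.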

1.9 Books and Literature

1.9.1 C++ Programming Language

  • Bjarne Stroustrup. The C++ Programming Language, 4th Edition
    • Coverage:
      • Structures, unions and enumerations.
      • C++11, Classes - construction, copy, cleanup and move.
      • Standard library (STL Containers, STL Algorithms, STL Iterators, Memory and Resources, I/O Streams, Numerics, Concurrency).
      • Templates: Instantiation, generic programming, specialization.
    • Amazon Link
  • Bruce Eckel. Thinking in C++. 1995
    • Notes: Despite being an old book, it has a step-by-step coverage of C++ main concepts and some design patterns.
  • Bjarne Stroustrup. A Tour of C++, Second Edition
  • Andrew Koenig and Barbara E. Moo. Accelerated C++: Practical Programming by Example
  • Andrei Alexandrescu. Modern C++ Design: Generic Programming and Design Patterns Applied 1st Edition. 2001
    • Notes: Provides a comprehensive and broad coverage of C++ generic/template metaprogramming.
    • Link: Amazon
  • Scott Meyers. Effective C++ Third Edition, 55 Specific Ways to Improve Your Programs and Designs

    More Advanced Books:

  • Martin Reddy. API Design for C++

1.9.2 C++17

1.9.3 System Programming - POSIX / Linux and UNIX

Note: Most of these books use C because operating system services and low-level system libraries are exposed through C interfaces, and the most widely used operating systems today were written in C. In addition, C++ still doesn't have a stable and standardized ABI (Application Binary Interface) as C does.

Books about Linux C APIs are useful not only for Linux, but also for other Unix-like OSes such as Mac OSX, BSD, Android (based on Linux), the QNX RTOS and so on.

  • Michael Kerrisk. The Linux Programming Interface - 2010 - ISBN 978-1-59327-220-3
  • Kurt Wall et al. Linux Programming Unleashed - 1999
    • Covers low-level system calls; process control; thread-synchronization primitives; TCP/IP sockets and networking; shared memory; and the X Window System/Xlib user interface.
    • The most used language in the book is C, although there are some examples in C++.
    • Amazon link to second edition
  • Linux Device Drivers, 3rd Edition - By Jonathan Corbet, Greg Kroah-Hartman, Alessandro Rubini
    • http://www.makelinux.co.il/ldd3/
    • "Over the years, this bestselling guide has helped countless programmers learn how to support computer peripherals under the Linux operating system, and how to develop new hardware under Linux. Now, with this third edition, it's even more helpful, covering all the significant changes to Version 2.6 of the Linux kernel. Includes full-featured examples that programmers can compile and run without special hardware."

1.9.4 System Programming - POSIX / MacOSX and UNIX

1.9.5 System Programming - Microsoft Windows NT

  • Mark Russinovich et al - Windows Internals - 5th edition - Microsoft Press 2009.
    • Coverage: Windows API, Virtual Memory, Kernel Mode X User Mode, Terminal, Object and Handles, Registry, Sysinternals Tools, Kernel System Components, System Calls, Windows Sockets (Winsock), NetBIOS, NTFS file system.
    • Amazon Link (6th edition)
  • Charles Petzold - Programming Windows - Microsoft Press - 5th edition - 1998
    • Coverage: Win32 API, Windows graphics stack, GDI (Graphics Device Interface), Dynamic-Link Libraries (DLLs).
    • Amazon Link
  • Johnson M. Hart, Win32 System Programming: A Windows® 2000 Application Developer's Guide, 2nd Edition, Addison-Wesley, 2000.
    • Note: This book discusses select Windows programming problems and addresses the problem of portable programming by comparing Windows and Unix approaches.
    • Amazon Link
  • Jeffrey Richter, Programming Applications for Microsoft Windows, 4th Edition, Microsoft Press, September 1999.
    • Note: This book provides a comprehensive discussion of the Windows API. Suggested reading.
  • Visual Basic - Programmer’s Guide to the Win32 API, The Authoritative Solution by Dan Appleman
  • Don Box - Essential COM 1st edition - 1998 - Addison-Wesley Professional - ISBN 978-0201634464
    • Comprehensive coverage of COM - Component Object Model.
    • Amazon Link

1.9.7 Scientific and Technical Computing

  • Discovering Modern C++: An Intensive Course for Scientists, Engineers, and Programmers (C++ In-Depth Series) 1st Edition

1.9.8 Computer Graphics

  • OpenGLBook [ONLINE, FREE] - http://openglbook.com/
    • "OpenGLBook.com is a free OpenGL programming tutorial in online book format. Click on The Book to start learning OpenGL 4.0. Several chapters contain OpenGL 3.3 compatible code samples in a sub-directory named "compatibility" in the source code listing, if you only have access to OpenGL 3 / DirectX 10 level hardware."
  • Anton's OpenGL 4 Tutorials - Anton Gerdelan
    • Amazon link
    • "This book is a practical guide to starting 3d programming with OpenGL, using the most recent version. It would suit anyone learning 3d programming that needs a practical guide with some help for common problems. The material is often used in this way by university courses and hobbyists. This book is a collection of worked-through examples of common real-time rendering techniques as used in video games or student projects. There are also some chapters or short articles for Tips and Tricks - not-so-obvious techniques that can add a lot of value to projects or make it easier to find problems. The idea is to be something like a lab manual - to get you going and over the trickier and more confusing hurdles presented by the API."
  • Real-Time Rendering, Fourth Edition - 4th Edition
    • Amazon link
    • "Thoroughly updated, this fourth edition focuses on modern techniques used to generate synthetic three-dimensional images in a fraction of a second. With the advent of programmable shaders, a wide variety of new algorithms have arisen and evolved over the past few years. This edition discusses current, practical rendering methods used in games and other applications. It also presents a solid theoretical framework and relevant mathematics for the field of interactive computer graphics, all in an approachable style. New to this edition: new chapter on VR and AR as well as expanded coverage of Visual Appearance, Advanced Shading, Global Illumination, and Curves and Curved Surfaces."
    • Key Features:
      • Covers topics from essential mathematical foundations to advanced techniques used by today’s cutting edge games.
      • Case studies are grounded in specific real-time rendering technologies.
      • Revised and revamped for its updated fourth edition, which focuses on modern techniques and used to generate three-dimensional images in a fraction of time old processes took.
      • Covers practical rendering for games to math and details for better interactive applications.
  • Physically Based Rendering: From Theory to Implementation - 3rd Edition
    • Companion Web Site: http://www.realtimerendering.com/
    • Amazon link
    • "Physically Based Rendering: From Theory to Implementation, Third Edition, describes both the mathematical theory behind a modern photorealistic rendering system and its practical implementation. Through a method known as 'literate programming', the authors combine human-readable documentation and source code into a single reference that is specifically designed to aid comprehension. The result is a stunning achievement in graphics education. Through the ideas and software in this book, users will learn to design and employ a fully-featured rendering system for creating stunning imagery. This completely updated and revised edition includes new coverage on ray-tracing hair and curves primitives, numerical precision issues with ray tracing, LBVHs, realistic camera models, the measurement equation, and much more. It is a must-have, full color resource on physically-based rendering."
  • Real-Time Collision Detection - (The Morgan Kaufmann Series in Interactive 3-D Technology) Hardcover – December 22, 2004
    • Amazon link
    • "Written by an expert in the game industry, Christer Ericson's new book is a comprehensive guide to the components of efficient real-time collision detection systems. The book provides the tools and know-how needed to implement industrial-strength collision detection for the highly detailed dynamic environments of applications such as 3D games, virtual reality applications, and physical simulators. Of the many topics covered, a key focus is on spatial and object partitioning through a wide variety of grids, trees, and sorting methods. The author also presents a large collection of intersection and distance tests for both simple and complex geometric shapes. Sections on vector and matrix algebra provide the background for advanced topics such as Voronoi regions, Minkowski sums, and linear and quadratic programming. Of utmost importance to programmers but rarely discussed in this much detail in other books are the chapters covering numerical and geometric robustness, both essential topics for collision detection systems. Also unique are the chapters discussing how graphics hardware can assist in collision detection computations and on advanced optimization for modern computer architectures. All in all, this comprehensive book will become the industry standard for years to come."
  • Computer Graphics: Principles and Practice in C - 2nd Edition
    • Amazon link
    • "A guide to the concepts and applications of computer graphics covers such topics as interaction techniques, dialogue design, and user interface software."

1.9.9 Coding Practices and Software Engineering

  • Design Patterns: Elements of Reusable Object-Oriented Software - (Addison-Wesley Professional Computing Series) 1st Edition
    • Amazon link
    • "Capturing a wealth of experience about the design of object-oriented software, four top-notch designers present a catalog of simple and succinct solutions to commonly occurring design problems. Previously undocumented, these 23 patterns allow designers to create more flexible, elegant, and ultimately reusable designs without having to rediscover the design solutions themselves. The authors begin by describing what patterns are and how they can help you design object-oriented software. They then go on to systematically name, explain, evaluate, and catalog recurring designs in object-oriented systems. With Design Patterns as your guide, you will learn how these important patterns fit into the software development process, and how you can leverage them to solve your own design problems most efficiently. Each pattern describes the circumstances in which it is applicable, when it can be applied in view of other design constraints, and the consequences and trade-offs of using the pattern within a larger design. All patterns are compiled from real systems and are based on real-world examples. Each pattern also includes code that demonstrates how it may be implemented in object-oriented programming languages like C++ or Smalltalk."
  • Writing Solid Code - (20th Anniversary 2nd Edition) Paperback – 2013
    • Amazon link
    • "Written by a former Senior Level Microsoft developer, this book takes on the problem of software errors by examining the kinds of mistakes that developers typically make. With the growing complexity of software today and the associated climb in bug rates, it's becoming increasingly necessary for programmers to produce bug-free code much earlier in the development cycle, before the code is first sent to the testing group. The key to writing bug-free code is to become more aware of how and why bugs come about. Programmers can gain this awareness by asking two simple questions for every bug they encounter: "How could I have prevented this bug?" and "How could I have automatically detected this bug?" The guidelines presented in this book are the results of programmers regularly asking these questions for every bug they've had to track down over years of programming. WRITING SOLID CODE provides practical approaches to prevention and automatic detection of bugs. Throughout, Steve Maguire draws candidly on the history of application development at Microsoft for cases in point-both good and bad-and shows you how to use proven programming techniques to write rock-solid code. If you're serious about developing world-class code, you'll benefit from Maguire's experience and practical advice in WRITING SOLID CODE."
  • Design by Contract, by Example 1st Edition - Richard Mitchell, Jim McKim and Bertrand Meyer
    • Amazon link
    • "Design by contract is an underused–but powerful–aspect of the object-oriented software development environment. With roots in the Eiffel programming language, it has withstood the test of time, and found utility with other programming languages. Here, by using both the Eiffel and Java languages as guidance, Design by Contract, by Example paves the way to learning this powerful concept."
    • Through the following six teaching principles, the authors demonstrate how to write effective contracts and supporting guidelines. Readers will learn how to:
      • Separate queries from commands
      • Separate basic queries from derived queries
      • Write a postcondition for each derived query that specifies what result can be returned
      • Write a postcondition for each command that specifies the value of every basic query
      • Decide on a suitable precondition for every query and command
      • Write invariants to define unchanging properties of objects
    • Contracts are built of assertions, which are used to express preconditions, postconditions and invariants. Using the above principles, the authors provide a frank discussion of the benefits, as well as the potential drawbacks, of this programming concept. Insightful examples from both the Eiffel and Java programming languages are included, and the book concludes with a summary of design by contract principles and a cost-benefit analysis of their applications.
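
The book uses Eiffel and Java, but the principles listed above can be approximated in C++ with plain assert() calls. This is a minimal sketch under that assumption, not an example from the book; the class BoundedCounter and its members are hypothetical.

```cpp
#include <cassert>

// Hypothetical example class. Contracts are expressed with assert(),
// which is only one possible mapping of Eiffel-style contracts to C++.
class BoundedCounter {
public:
    explicit BoundedCounter(int limit) : limit_(limit) {
        assert(limit > 0);  // precondition
        assert(invariant());
    }

    // Basic queries: return state, no side effects.
    int count() const { return count_; }
    int limit() const { return limit_; }

    // Derived query: defined purely in terms of the basic queries.
    bool is_full() const { return count() == limit(); }

    // Command: changes state and returns nothing.
    void increment() {
        assert(!is_full());  // precondition
        const int old_count = count_;
        ++count_;
        assert(count_ == old_count + 1);  // postcondition on a basic query
        assert(invariant());
    }

private:
    // Invariant: must hold before and after every public call.
    bool invariant() const { return 0 <= count_ && count_ <= limit_; }

    int count_ = 0;
    int limit_;
};
```

Unlike Eiffel contracts, these checks disappear when NDEBUG is defined and are not inherited by derived classes, so they document and test the contract rather than enforce it.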

1.9.10 Miscellaneous Online Books

  • Elements of Programming (Alexander A. Stepanov)
  • Contents — Professional Software Development 2019.01
  • The Architecture of Open Source Applications
    • "Architects look at thousands of buildings during their training, and study critiques of those buildings written by masters. In contrast, most software developers only ever get to know a handful of large programs well—usually programs they wrote themselves—and never study the great programs of history. As a result, they repeat one another's mistakes rather than building on one another's successes. Our goal is to change that. In these two books, the authors of four dozen open source applications explain how their software is structured, and why. What are each program's major components? How do they interact? And what did their builders learn during their development? In answering these questions, the contributors to these books provide unique insights into how they think. "
  • OpenGLBook [ONLINE, FREE] - http://openglbook.com/
    • "OpenGLBook.com is a free OpenGL programming tutorial in online book format. Click on The Book to start learning OpenGL 4.0. Several chapters contain OpenGL 3.3 compatible code samples in a sub-directory named "compatibility" in the source code listing, if you only have access to OpenGL 3 / DirectX 10 level hardware."
  • Linux Device Drivers, 3rd Edition - By Jonathan Corbet, Greg Kroah-Hartman, Alessandro Rubini
    • http://www.makelinux.co.il/ldd3/
    • "Over the years, this bestselling guide has helped countless programmers learn how to support computer peripherals under the Linux operating system, and how to develop new hardware under Linux. Now, with this third edition, it's even more helpful, covering all the significant changes to Version 2.6 of the Linux kernel. Includes full-featured examples that programmers can compile and run without special hardware."

1.10 ABI - Application Binary Interface

1.10.1 Itanium Portable ABI

  • Itanium C++ ABI
    • "The Itanium C++ ABI is an ABI for C++. As an ABI, it gives precise rules for implementing the language, ensuring that separately-compiled parts of a program can successfully interoperate. Although it was initially developed for the Itanium architecture, it is not platform-specific and can be layered portably on top of an arbitrary C ABI. Accordingly, it is used as the standard C++ ABI for many major operating systems on all major architectures, and is implemented in many major C++ compilers, including GCC and Clang."
  • Itanium C++ ABI (Revision: 1.83)
  • GNU g++: /usr/include/c++/5/cxxabi.h File Reference
  • GCC5 and the C++11 ABI - RHD Blog

1.10.2 Drawbacks and ABI Issues

Drawbacks

  • C++ is not memory safe. Bugs such as stack overflows, buffer overflows and null pointer dereferences may happen.
  • Operating System Dependent - C++ source code may be portable, but the compiled binaries are not cross-platform, since the code is compiled to machine code for a particular operating system.
  • Hardware Dependent (Processor Architecture) - C++ is compiled to machine code / binary code for a particular operating system and processor architecture, with different executable formats. The most common processor architectures are Intel x86 (32 bits) and AMD64 (64 bits).
    • OS Windows / Executable Format - PE (Portable Executable)
    • Unix (Linux, BSD …) / Executable Format - ELF
    • Mac OSX / Executable Format - Mach-O
  • No Standard ABI (Application Binary Interface) - C++ shared libraries and programs compiled with different compilers or different versions of the same compiler may be incompatible because, unlike C, C++ doesn't have a standard ABI. This makes it hard to call libraries written in C++ through an FFI - Foreign Function Interface from another programming language such as Python.

ABI Issues - Credits: Defining a Portable C++ ABI - https://isocpp.org/files/papers/n4028.pdf

"A C++ developer cannot compile C++ code and share the object file with other C++ developers on the same platform and know that the result will compile and link correctly. Our status quo is that two source files a.cpp and b.cpp can only be linked together if they are compiled with both:" – (Herb Sutter)

  • "the same version of the same compiler, or another compiler with a compatibility mode" (Herb Sutter)
  • "compatible switch settings, since most C++ compilers offer incompatible switch settings where even compiling two files with the same version of the same compiler will not link successfully." (Herb Sutter)

Issues:

  • "It makes sharing binary C++ libraries more difficult: To ship a C++ library in binary form for a given platform requires building it with possibly dozens of popular combinations of switch settings for the popular compiler(s) on that platform, and then may not cover all combinations. Alternatively, one can wrap the library in that platform’s stable C ABI, which brings us to…" (Herb Sutter)
  • "It is a valid reason to use C: This is (the) one area where C is superior to C++. Among programs and programmers who would otherwise use C++, the top reason to use C appears to be the inability to publish an API with a stable binary ABI, including that it can be linked to from C, C++, and other languages’ foreign function interfaces (FFIs) such as Java JNI and .NET PInvoke. In particular…" (Herb Sutter)
  • "It therefore creates ongoing security problems: The fact that C is the only de facto ABI-stable lingua franca continues to encourage type- and memory-unsafe C APIs that traffick in things like error prone pointer/length pairs instead of more strongly typed and still highly efficient abstractions, including but not limited to std::string or the new string_view" (Herb Sutter)
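
The last quote contrasts pointer/length pairs with typed abstractions such as string_view. A small sketch of the difference, assuming C++17; the function names count_spaces_c and count_spaces are hypothetical:

```cpp
#include <cstddef>
#include <string_view>

// C-style signature: nothing ties `length` to `data`, so a caller can
// pass a length that is larger than the actual buffer.
inline std::size_t count_spaces_c(const char* data, std::size_t length) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < length; ++i) {
        if (data[i] == ' ') {
            ++n;
        }
    }
    return n;
}

// C++17 signature: std::string_view carries pointer and length together.
inline std::size_t count_spaces(std::string_view text) {
    std::size_t n = 0;
    for (char c : text) {
        if (c == ' ') {
            ++n;
        }
    }
    return n;
}
```

Only the first signature can be exported through a C ABI, because the in-memory layout of std::string_view is not part of any C interface; this is the trade-off the quote describes.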

Solutions to ABI compatibility issues

  1. Distribute the library in source format. This approach is adopted by Qt (formerly Trolltech, now The Qt Company) with open source and commercial licenses.
  2. Distribute the library in binary format and only support a specific compiler.
  3. Compile the C++ shared library with all possible compilers and distribute the binaries for each compiler, compiler version, processor architecture and operating system.
  4. Write the library in C, instead of C++. This approach is followed by most Unix/Linux libraries and OpenGL and Gtk GUI toolkit.
  5. Use some language that can compile/generate C-code (transpiler).
  6. Use Microsoft COM (Component Object Model)/ DCOM or CORBA, DBUS …

Note: So far, C is the only language with a stable and publicly documented ABI on each major platform. Most operating systems expose their APIs through a C interface, and programming language runtimes are generally implemented in C.

1.11 Reference Cards for shell scripting languages and command line tools

Unix Shell Script

Bash shell script:

ZSH shell script:

Power Shell (Windows-Only)

Command Line Tools

Curl - command line client for HTTP, FTP and other protocols.

Httpie - http command line client

Unix Find Command - tool for finding files on disk:

Rsync - tool for fast file transfer and incremental backup

Rename - cli app for bulk file renaming:

Watchexec - executes commands whenever a file changes.

Hexdump:

Mac OSX Brew command line package manager: (Note: Brew can also be used in Linux for installing applications without root Access.)

Android ADB (Android Debug Bridge)

Linux troubleshooting tools:

Radare2 tools

1.12 C++ Resources

Operating System

  • Operating Systems: Three Easy Pieces
    • free online operating systems book! The book is centered around three conceptual pieces that are fundamental to operating systems: virtualization, concurrency, and persistence. In understanding the conceptual, you will also learn the practical, including how an operating system does things like schedule the CPU, manage memory, and store files persistently. Lots of fun stuff!
  • https://manybutfinite.com/
    • Provides lots of information about useful operating systems concepts necessary for better understanding of system programming.

C++ General Resources

C++ Numerical Methods and Scientific Computing

C++ STL - Standard Template Library

C++ ABI - Application Binary Interface, Binary Compatibility and FFI

C-Interface

FFI - Foreign Function Interface

Courses and Online Books

Unix - API / LibC

Embedded Systems

Alternatives to C++

The C++ language is suitable for system programming, writing native applications and writing high performance software components or libraries. However, the lack of a standard ABI - Application Binary Interface makes it harder to call a C++ library through the FFI - Foreign Function Interface of another language.

Due to the C++ ABI issues, many portable libraries that are easy to invoke through an FFI are written in C, for instance, GTK GUI toolkit, …

Selection Requirements:

  • Compile to native code.
  • Have a stable and standard ABI - Application Binary Interface like C.
  • Be able to build shared libraries *.so or *.dll that are easily invoked through the FFI - Foreign Function Interfaces of high level languages such as Python, Ruby, Java, C# and so on.
  • Be memory safe in order to avoid buffer overflow.

D language

Gambit Scheme

An interactive Scheme implementation with a REPL that can generate C code and invoke C libraries. Scheme code can be compiled to shared libraries (*.so or *.dll) and called from the Scheme REPL.

Rust

1.13 C => to C++ Guidelines

  • Malloc - Avoid malloc and manual memory management. Use new instead of malloc, and std::vector instead of realloc.
  • Pointers - Avoid raw pointers where a reference or a standard library container can be used instead.
  • Arrays - Use the C++ STL vector class (std::vector) instead of arrays.
  • Strings - Don't use arrays of characters to represent strings; use C++ strings (std::string) by adding '#include <string>' at the top of the file.
  • Separate the operating system dependent code from the operating system agnostic code.
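
A short sketch of the vector and string guidelines above. The function names squares and join are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// std::vector replaces a buffer that C code would grow with realloc.
inline std::vector<int> squares(int n) {
    std::vector<int> values;
    for (int i = 0; i < n; ++i) {
        values.push_back(i * i);  // grows automatically
    }
    return values;  // the memory is released by the vector's destructor
}

// std::string replaces a fixed-size char array filled with strcat/sprintf.
inline std::string join(const std::vector<int>& values, const std::string& sep) {
    std::string out;
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (i != 0) {
            out += sep;
        }
        out += std::to_string(values[i]);
    }
    return out;
}
```

Neither function contains a malloc, free, new or delete call, and there is no buffer size to get wrong: join(squares(4), ",") produces the string "0,1,4,9".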

1.14 Cross Language Interoperability / Language Bindings - C-API and FFI

Stack Overflow Questions

  • Developing C wrapper API for Object-Oriented C++ code
    • Manual solution: Disadvantage - requires maintaining the C-API and the C++ code.
      • Every object is passed around in C as an opaque handle (a void* pointer).
      • Constructors and destructors are wrapped in plain (free) functions.
      • Member functions are wrapped in plain (free) functions.
      • Other builtins are mapped to C equivalents where possible.
    • Automatic Solution: SWIG Wrapper generator.
      • Disadvantage: SWIG cannot parse all C++ code.

Botan library C-API and language bindings

CXXI: Bridge the C++ and C# Worlds (Non Portable based on GCCXML)

Swig - Wrapper Generator

1.15 Interesting Source Codes

1.16 Computer Archeology and Computer History

1.16.1 General

1.16.2 Historical Videos

Non Categorized

Mechanical and Analog Computers

Mechanical and/or analog computers were single purpose and not programmable. They were used as control systems or for scientific or engineering calculations, especially for solving differential equations.

Analog Computer

Mechanical Computers

ENIAC and EDSAC - Early Modern Computers

Apollo Computer

The first embedded computer (embedded system), which helped take mankind to the moon.

PDP11 - Mini computer

Where the C programming language was born.

UNIX Operating Systems and Mainframes

Object Oriented Programming and Early GUIs - Graphical User Interfaces

The modern GUI and mouse as they are known today were introduced at Xerox PARC in the Smalltalk environment and later popularized by Apple.

Created: 2021-06-04 Fri 15:09
