C

From Dave's wiki

Some notes on C and C++.

Stuff to read

Following a tutorial

Learning how to program in C from http://www.cprogramming.com/tutorial/c/lesson1.html

http://en.wikipedia.org/wiki/Procedural_programming

Every full C program begins inside a function called "main" because the main function is always called when the program first executes.

To access the standard functions that come with your compiler, you need to include the appropriate header with the #include directive.

#include <stdio.h>
int main()
{
    printf( "I am alive!  Beware.\n" );
    getchar();
    return 0;
}

The #include is a "preprocessor" directive: before the executable is actually created, the preprocessor puts the code from the header called stdio.h into our program.

#compile
gcc test.c
./a.out
I am alive!  Beware.

A variable of type char stores a single character, variables of type int store integers (numbers without decimal places), and variables of type float store numbers with decimal places.

Use the scanf function to read in a value and the printf function to print it back out.

#include <stdio.h>

int main()
{
    int this_is_a_number;
    printf( "Please enter a number: " );
    scanf( "%d", &this_is_a_number );
    printf( "You entered %d \n", this_is_a_number);
    getchar();
    return 0;
}

Compile and run

gcc number.c
./a.out
Please enter a number: 23
You entered 23

Programming in C

My notes while reading http://www.amazon.com/Programming-3rd-Edition-Stephen-Kochan/dp/0672326663

Chapter 1

The C programming language was pioneered by Dennis Ritchie at AT&T Bell Laboratories in the early 1970s.

The American National Standards Institute (ANSI) was the organisation that standardised the definition of the C language; in 1983 an ANSI C committee (called X3J11) was formed to standardise C. In 1990, the first official ANSI standard definition of C was published. The International Organization for Standardization (ISO) adopted the standard and called it ISO/IEC 9899:1990. When the book was written, the most recent standard was the one adopted in 1999, known as ANSI C99 or ISO/IEC 9899:1999; newer revisions (C11, C17) have been published since.

Chapter 2

The basic operations of a computer system form what is known as the computer's instruction set.

To solve a problem using a computer, you must express the solution to the problem in terms of the instructions of the particular computer. A computer program is just a collection of the instructions necessary to solve a specific problem. The approach or method that is used to solve a specific problem is known as an algorithm.

When computers were first developed, the only way they could be programmed was in terms of binary numbers that corresponded directly to specific machine instructions and locations in the computer's memory. The next technological software advance occurred in the development of assembly languages, which enabled programmers to work with the machine on a slightly higher level. A special program, known as an assembler, translates the assembly language program from its symbolic format into the specific machine instructions of the computer system.

Different processor types have different instruction sets, and because assembly language programs are written in terms of these instruction sets, they are machine dependent.

To support a higher-level language, a special computer program must be developed that translates the statements of the program developed in the higher-level language into a form that the computer can understand - in other words, into the particular instructions of the computer. Such a program is known as a compiler.

An operating system (OS) is a program that controls the entire operation of a computer system. All input and output operations that are performed on a computer system are channelled through the OS. The OS must also manage the computer system's resources and must handle the execution of programs.

A compiler is a software program that analyses a program developed in a particular computer language and then translates it into a form that is suitable for execution on your particular computer system.

Syntactic error -> unbalanced parenthesis

Semantic error -> use of a variable that is not defined

Source code -> compile -> assemble -> link

The assembler takes each assembly language statement and converts it into binary format known as object code, which is then written into another file on the system. This file typically has the same name as the source file under Unix, with the last letter an "o" (for object) instead of a "c". After the program has been translated into object code, it is ready to be linked. The purpose of the linking phase is to get the program into a final form for execution on the computer. If the program uses other programs that were previously processed by the compiler, then during this phase the programs are linked together. Programs that are used from the system's program library are also searched and linked together with the object program during this phase.

The process of compiling and linking a program is often called building.

The final linked file, which is an executable object code format, is stored in another file on the system, ready to be executed.

An interpreter analyses and executes the statements of a program at the same time. Interpreted programs are typically slower than their compiled counterparts because the program statements are not converted into their lowest-level form in advance of their execution.

Chapter 3

C++

Notes from SoloLearn. https://softwareengineering.stackexchange.com/questions/113295/when-to-use-c-over-c-and-c-over-c

Data types

Memory is reserved for each variable, and what will be stored in that memory is determined by the variable's data type. The data type defines the proper use of an identifier, what kind of data can be stored, and which types of operations can be performed.

Literals

String literals are placed in double quotation marks. A string is an ordered sequence of characters. The string data type is part of the Standard Library; you need to include the <string> header to use it. Many implementations of <iostream> happen to include <string> as well, but the standard does not guarantee this, so it is safer to include <string> explicitly.

#include <string>
#include <iostream>
using namespace std;

Characters are single letters or symbols, and must be enclosed between single quotes. A char variable holds a 1-byte integer. However, instead of interpreting the value of the char as an integer, the value of a char variable is typically interpreted as an ASCII character.

char letter = 'x';

Integers

To define an integer data type:

int a = 42;

Several of the basic types, including integers, can be modified using one or more of these type modifiers:

  • signed: A signed integer can hold both negative and positive numbers
  • unsigned: An unsigned integer can hold only zero and positive values
  • short: At least 16 bits, and never larger than the default int
  • long: At least 32 bits, and never smaller than the default int

Floating point numbers

A floating point type variable can hold a real number, such as 420.0, -3.33, or 0.03325. The words floating point refer to the fact that a varying number of digits can appear before and after the decimal point. You could say that the decimal has the ability to "float". There are three different floating point data types: float, double, and long double. In most modern architectures, a float is 4 bytes, a double is 8, and a long double can be equivalent to a double (8 bytes), or 16 bytes. Floating point data types are always signed.

Booleans

bool blah = false;

Arrays

An array is a collection of variables that are all of the same type. When declaring an array, specify its element types, as well as the number of elements it will hold.

int b[5] = {11, 45, 62, 70, 88};
cout << b[3] << endl;
// Outputs 70

If the size of the array is omitted and a list of initial values is given, an array just big enough to hold those values will be created.

For multi-dimensional arrays:

type name[size1][size2]...[sizeN];
// 2D array
int x[2][3] = {
  {2, 3, 4}, // 1st row
  {8, 9, 10} // 2nd row
};

Pointers

Every variable is a memory location, which has its address defined. That address can be accessed using the ampersand (&) operator (also called the address-of operator), which denotes an address in memory. A pointer is a variable, with the address of another variable as its value. In C++, pointers help make certain tasks easier to perform. Other tasks, such as dynamic memory allocation, cannot be performed without using pointers. Whatever type a pointer points to, the value it holds is a memory address, which is usually displayed as a long hexadecimal number.

The asterisk sign (the same asterisk that you use for multiplication) is used to declare a pointer; in a declaration, it designates the variable as a pointer.

int *ip;  // pointer to an integer
double *dp;   // pointer to a double
float *fp;  // pointer to a float
char *ch;  // pointer to a character

There are two operators for pointers:

  • Address-of operator (&): returns the memory address of its operand
  • Contents-of (or dereference) operator (*): returns the value of the variable located at the address specified by its operand

int var = 50;
int *p;
p = &var;
cout << *p << endl;

Dynamic memory

In a C++ program, memory is divided into two parts:

  • The stack: All of your local variables take up memory from the stack
  • The heap: Unused program memory that can be used when the program runs to dynamically allocate the memory

Often you do not know in advance how much memory you will need to store particular information; the required size can only be determined at run time. You can allocate memory on the heap at run time for a variable of a given type using the new operator, which returns the address of the space allocated.

new int;

The allocated address can be stored in a pointer, which can then be dereferenced to access the variable.

// allocated address = new int
// this address is stored in pointer p
int *p = new int;
// assign 5 to the dynamically allocated memory that was created using new int
*p = 5;

We have dynamically allocated memory for an integer, and assigned it a value of 5. The pointer p is stored in the stack as a local variable, and holds the heap's allocated address as its value. The value of 5 is stored at that address in the heap.

For local variables on the stack, managing memory is carried out automatically. On the heap, it's necessary to manually handle the dynamically allocated memory, and use the delete operator to free up the memory when it's no longer needed.

int *p = new int; // request memory
*p = 5; // store value

cout << *p << endl; // use value

delete p; // free up the memory

The delete operator frees up the memory allocated for the variable, but does not delete the pointer itself, as the pointer is stored on the stack. Pointers that are left pointing to non-existent memory locations are called dangling pointers.

The NULL pointer is a constant with a value of zero that is defined in several of the standard headers, such as <cstddef>. It is good practice to assign NULL to a pointer variable when you declare it, in case you do not yet have an exact address to assign. A pointer assigned NULL is called a null pointer. Since C++11, the keyword nullptr is the preferred way to write a null pointer.

int *ptr = NULL;

Allocate dynamic memory to an array:

int *p = NULL; // Pointer initialized with null
p = new int[20]; // Request memory
delete [] p; // Delete array pointed to by p

Functions

A function is a group of statements that perform a particular task. Every valid C++ program has at least one function - the main() function. A function's return type is declared before its name.

// return type is int
int main()
{
  // some code
  return 0;
}

Occasionally, a function will perform the desired operations without returning a value; such functions are defined with the keyword void. void is a basic data type that defines a valueless state.

Define a C++ function using the following syntax:

return_type function_name( parameter list )
{
   body of the function
}

  • return_type: Data type of the value returned by the function
  • function_name: Name of the function
  • parameter list: When a function is invoked, you pass a value to each parameter. This value is referred to as the actual parameter or argument. The parameter list refers to the type, order, and number of the parameters of a function. Parameters are optional; that is, you can have a function with no parameters
  • body of the function: A collection of statements defining what the function does

You must declare a function prior to calling it. A function declaration, or function prototype, tells the compiler about a function name and how to call the function; the actual body of the function can be defined separately. A function declaration is required when you define a function in one source file and call that function in another file. In that case, you should declare the function at the top of the file calling the function.

For a function to use arguments, it must declare formal parameters, which are variables that accept the argument's values.

void printSomething(int x) 
{
   cout << x;
}

This defines a function that takes one integer parameter and prints its value. Formal parameters behave within the function similarly to other local variables. They are created upon entering the function, and are destroyed upon exiting the function.

Objects

Objects are independent units, and each has its own identity, just as objects in the real world do. An apple is an object; so is a mug. Each has its unique identity. It's possible to have two mugs that look identical, but they are still separate, unique objects.

An object might contain other objects but they're still different objects. Objects also have characteristics that are used to describe them. For example, a car can be red or blue, a mug can be full or empty, and so on. These characteristics are also called attributes. An attribute describes the current state of an object. Objects can have multiple attributes (the mug can be empty, red and large). An object's state is independent of its type; a cup might be full of water, another might be empty.

In the real world, each object behaves in its own way. The car moves, the phone rings, and so on. The same applies to objects - behaviour is specific to the object's type. So, the following three dimensions describe any object in object oriented programming:

  1. Identity
  2. Attributes
  3. Behaviour

In programming, an object is self-contained, with its own identity. It is separate from other objects. Each object has its own attributes, which describe its current state. Each exhibits its own behaviour, which demonstrates what it can do.

Classes

Objects are created using classes, which are actually the focal point of OOP. The class describes what the object will be, but is separate from the object itself. In other words, a class can be described as an object's blueprint, description, or definition. You can use the same class as a blueprint for creating multiple different objects. For example, in preparation for creating a new building, the architect creates a blueprint, which is used as a basis for actually building the structure. That same blueprint can be used to create multiple buildings.

Programming works in the same fashion. We first define a class, which becomes the blueprint for creating objects. Each class has a name, and describes attributes and behaviour. The term type is used to refer to a class name, i.e. creating an object of a particular type. Attributes are also referred to as properties or data.

Method is another term for a class' behaviour. A method is basically a function that belongs to a class. Methods are similar to functions - they are blocks of code that are called, and they can also perform actions and return values. For example, if we are creating a banking program, we can give our class the following characteristics:

  • Name: BankAccount
  • Attributes: accountNumber, balance, dateOpened
  • Behaviour: open(), close(), deposit()

The class specifies that each object should have the defined attributes and behaviour. However, it doesn't specify what the actual data is; it only provides a definition. Once we've written the class, we can move on to create objects that are based on that class. Each object is called an instance of a class. The process of creating objects is called instantiation. Each object has its own identity, data, and behaviour.

Begin your class definition with the keyword class. Follow the keyword with the class name and the class body, enclosed in a set of curly braces. The following code declares a class called BankAccount:

class BankAccount {
// A class definition must be followed by a semicolon.
};

Define all attributes and behaviour (or members) in the body of the class, within curly braces. You can also define an access specifier for members of the class. A member that has been defined using the public keyword can be accessed from outside the class, as long as it's anywhere within the scope of the class object.

Create a class with one public method, and have it print out "Hi".

class BankAccount {
  public:
    void sayHi() {
      cout << "Hi" << endl;
    }
};

The next step is to instantiate an object of our BankAccount class, in the same way we define variables of a type, the difference being that our object's type will be BankAccount.

int main() 
{
  BankAccount test;
  test.sayHi();
}

Abstraction and encapsulation

Data abstraction is the concept of providing only essential information to the outside world. It's a process of representing essential features without including implementation details. A good real-world example is a book: when you hear the term book, you don't know the exact specifics (the page count, the colour, the size), but you understand the idea of a book - the abstraction of the book.

The concept of abstraction is that we focus on essential qualities, rather than the specific characteristics of one particular example. Abstraction means that we can have an idea or a concept that is completely separate from any specific instance. It is one of the fundamental building blocks of object oriented programming. For example, when you use cout, you're actually using the cout object of the class ostream. Abstraction allows us to write a single bank account class, and then create different objects based on the class, for individual bank accounts, rather than creating a separate class for each bank account.

Encapsulation is the idea of "surrounding" an entity, not just to keep what's inside together, but also to protect it. In object orientation, encapsulation means more than simply combining attributes and behaviour together within a class; it also means restricting access to the inner workings of that class. The key principle here is that an object only reveals what the other application components require to effectively run the application. All else is kept out of view, a.k.a. data hiding.

For example, if we take our BankAccount class, we do not want some other part of our program to reach in and change the balance of any object, without going through the deposit() or withdraw() behaviours. We should hide that attribute and control access to it, so it is accessible only by the object itself. This way, the balance cannot be directly changed from outside of the object and is accessible only using its methods. This is also known as "black boxing", which refers to closing the inner working zones of the object, except for the pieces that we want to make public. This allows us to change attributes and implementation of methods without altering the overall program. For example, we can come back later and change the data type of the balance attribute.

In summary the benefits of encapsulation are:

  1. Control the way data is accessed or modified
  2. Code is more flexible and easy to change with new requirements
  3. Change one part of the code without affecting other parts of the code

Access specifiers are used to set access levels to particular members of the class. The three levels of access specifiers are public, protected, and private. A public member is accessible from outside the class, and anywhere within the scope of the class object.

#include <iostream>
#include <string>
using namespace std;

class myClass {
  // note colon
  public:
    string name;
};

int main() {
  myClass myObj;
  myObj.name = "SoloLearn";
  cout << myObj.name;
  return 0;
}

A private member cannot be accessed, or even viewed, from outside the class; it can be accessed only from within the class. If no access specifier is defined, all members of a class are set to private by default.

Inheritance

Inheritance allows us to define a class based on another class. This facilitates greater ease in creating and maintaining an application. The class whose properties are inherited by another class is called the Base class. The class which inherits the properties is called the Derived class. For example, the Daughter class (derived) can be inherited from the Mother class (base). The derived class inherits all features from the base class, and can have its own additional features. The idea of inheritance implements the IS-A relationship. For example, mammal IS-A animal, dog IS-A mammal, hence dog IS-A animal as well.

class Mother
{
 public:
  Mother() {};
  void sayHi() {
    cout << "Hi";
  } 
};

class Daughter : public Mother
{
 public: 
  Daughter() {};
};

The Base class is specified using a colon and an access specifier: public means that all public members of the base class are public in the derived class. A class can be derived from multiple classes by specifying the base classes in a comma-separated list. For example: class Daughter: public Mother, public Father.

A derived class inherits all base class methods with the following exceptions:

  • Constructors, destructors
  • Overloaded assignment operators (operator=); other operators defined as member functions are inherited
  • The friend functions

Access specifiers are also used to specify the type of inheritance; private and protected access specifiers can also be used here.

  • Public Inheritance: public members of the base class become public members of the derived class and protected members of the base class become protected members of the derived class. A base class's private members are never accessible directly from a derived class, but can be accessed through calls to the public and protected members of the base class.
  • Protected Inheritance: public and protected members of the base class become protected members of the derived class.
  • Private Inheritance: public and protected members of the base class become private members of the derived class.

Compiling

./configure --prefix=/home/dtang
export LD_LIBRARY_PATH=/home/dtang/src/
make
make install

Libraries

Say your library (https://stackoverflow.com/questions/16710047/usr-bin-ld-cannot-find-lnameofthelibrary?rq=1) is named libxyz.so and is located in the directory:

/home/user/myDir

then to link it to your program:

g++ myprog.cpp -L/home/user/myDir -lxyz -o myprog

Print shared object dependencies

ldd $(which samtools)

Links