Software Development With Karl: May 2021

Friday, 21 May 2021

Tips To Improve Software Development

1. Write A Problem Statement
2. Break Problem Into Smaller Manageable Tasks
3. Limit Use of Third-Party Libraries
4. Use Immutable Data
5. Avoid Primitive Obsession
6. Never Use Global Variables

1. Write A Problem Statement

Software is written to solve problems. If you don't fully understand the problem, you are unlikely to solve it correctly. Also, as time progresses you may forget the original problem statement or deviate from solving the original problem. Write (or type) the problem statement and keep it handy. I usually write an abstract text file and store it within the project. This ensures I can visit it frequently to keep the problem statement fresh in my mind.

2. Break Problem Into Smaller Manageable Tasks

This is almost like a to-do list. For example, assume you wish to create an app that populates a database with invoice details and posts them to a financial system. One way to break this down might be as follows.

Obtain invoice source data
Validate source data
Select/filter relevant data
Populate invoice database
Post invoice details to financial system

Notice that each task is brief, but completing every task will result in a solution. Before attempting a task, break it into even smaller tasks until you reach a point where you feel ready to start writing some code. Using the above example, I might break the last task into the following smaller tasks.

Initiate financial application session
Select pending invoices from database
For each pending invoice perform the following
- Log invoice details
- Validation (e.g. check supplier is valid, that net, VAT and gross tally and so on)
- Attempt to post to financial system
- Update database to reflect invoice posted or failed to post
- Log invoice post details
Tidy up - close any database connections, log off financial system

Using this approach you can enter tasks and subtasks into a to-do list or spreadsheet and mark them as completed as you progress.
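The breakdown can also carry over into code, with each small task becoming a small function. The following is only a sketch of the "for each pending invoice" step. The Invoice type, the InvoicePosting class and the postToFinance and log parameters are all hypothetical names invented for illustration, and the returned dictionary merely stands in for the database update.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical invoice type, invented for this sketch only.
public class Invoice
{
  public int Id { get; }
  public string Supplier { get; }
  public decimal Net { get; }
  public decimal Vat { get; }
  public decimal Gross { get; }

  public Invoice(int id, string supplier, decimal net, decimal vat, decimal gross)
  {
    Id = id; Supplier = supplier; Net = net; Vat = vat; Gross = gross;
  }
}

public static class InvoicePosting
{
  // Validation: a supplier is present and net + VAT tally with gross.
  public static bool IsValid(Invoice invoice) =>
    !string.IsNullOrWhiteSpace(invoice.Supplier) && invoice.Net + invoice.Vat == invoice.Gross;

  // For each pending invoice: log, validate, attempt to post, record the outcome, log again.
  // Returns a map of invoice id to posted (true) or failed to post (false).
  public static Dictionary<int, bool> PostPending(
    IEnumerable<Invoice> pending, Func<Invoice, bool> postToFinance, Action<string> log)
  {
    var outcome = new Dictionary<int, bool>();
    foreach (Invoice invoice in pending)
    {
      log($"Processing invoice {invoice.Id}");
      bool posted = IsValid(invoice) && postToFinance(invoice);
      outcome[invoice.Id] = posted;   // stands in for the database update
      string status = posted ? "posted" : "failed to post";
      log($"Invoice {invoice.Id} {status}");
    }
    return outcome;
  }
}
```

Each sub task from the list maps onto one or two lines, which makes it easy to tick items off as they are completed.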

3. Limit Use of Third-Party Libraries

Introducing additional dependencies into software also introduces additional failure points. The additional dependency may contain unknown bugs which will ultimately become your bugs.

Typically, when using third-party libraries I will code an API wrapper. I then use my wrapper rather than calling third-party code directly. This has a number of benefits.

If I spot a potential bug I can output trace information or place breakpoints for debugging within my wrapper function.
I can add additional code within my wrapper function to fix bugs.
I can wrap multiple third-party library function calls into a single wrapper function.
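
As a rough sketch of the idea, the snippet below wraps a made-up library. VendorCsv is a stand-in I have invented so the example compiles on its own; it is not a real package. CsvReader and its trace parameter are likewise hypothetical names.

```csharp
using System;

// Stand-in for a third-party library. Hypothetical, defined here only so the sketch compiles.
public static class VendorCsv
{
  public static string[] SplitLine(string line) => line.Split(',');
}

// The wrapper: the rest of the application calls CsvReader, never VendorCsv directly.
public static class CsvReader
{
  public static string[] ReadFields(string line, Action<string> trace)
  {
    // Trace output, and a convenient place for a breakpoint.
    trace($"ReadFields input: '{line}'");

    // Extra code to work around a (pretend) vendor bug: a null line would throw.
    if (string.IsNullOrEmpty(line))
      return Array.Empty<string>();

    // Several steps combined into a single wrapper function.
    string[] fields = VendorCsv.SplitLine(line);
    for (int i = 0; i < fields.Length; i++)
      fields[i] = fields[i].Trim();

    trace($"ReadFields produced {fields.Length} field(s)");
    return fields;
  }
}
```

If the library is ever replaced, only the wrapper needs to change.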

4. Use Immutable Data

One of the biggest problems with software is complexity. As expected, complexity increases as the software grows in size. State, and particularly managing state, is probably the biggest contributing factor to complexity.

There are two ways of managing state.

Mutable State

Used in classic object-oriented programming. Data is modified in-place; that is, the original data itself is changed.

Immutable State

Used in most (all?) functional programming languages. New data is created by transforming the original data, which is left untouched.

As an example, consider a simple Person class that contains first and last names. The following C# snippet illustrates mutable data.

class Person
{
  public string FirstName { get; set; }
  public string LastName { get; set; }
  
  public Person(string firstName, string lastName)
  {
    this.FirstName = firstName;
    this.LastName = lastName;
  }
}

static class Test
{
  public static void Run()
  {
    // Create a new person Fred Bloggs
    Person p1 = new Person("Fred", "Bloggs");
    
    p1.LastName = "Smith";
    // Fred Bloggs has changed to Fred Smith, the original data is lost
  }
}

The following C# snippet illustrates immutable data.

class Person
{
  // Note, set has been removed
  public string FirstName { get; }
  public string LastName { get; }
  
  public Person(string firstName, string lastName)
  {
    this.FirstName = firstName;
    this.LastName = lastName;
  }
}

static class Test
{
  public static void Run()
  {
    // Create a new person Fred Bloggs
    Person p1 = new Person("Fred", "Bloggs");
    
    Person p1New = new Person(p1.FirstName, "Smith");
    // Fred Bloggs remains unchanged, p1New contains new name, "Fred Smith"
  }
}

Using immutable data has the following benefits.

Thread safety

Multiple threads can read data without using synchronisation. Immutable data is read only so data integrity is assured. If data must be modified (i.e. mutable) then thread synchronisation is required to ensure data integrity.

Good key elements

Certain data structures (dictionaries, hash sets, most tree structures, and so on) require keys to access values. Once inserted, the key should never change. Changing the original key will result in the original value being unobtainable.

Easier to reason about code

Functions tend to be free from side-effects as input parameters cannot be modified. The class invariant is established once (during construction) and remains unchanged for the lifetime of the object.

Easier to code

Immutable data is constructed once and cannot change afterwards. Functions (and associated logic, validation, etc.) to modify existing data are not required.

Improved caching

Caching is used when data creation is relatively expensive. Examples include reading configuration data, reading from a database and so on. In these scenarios you would expect the same data each time you ask the cache for a particular item. Immutable data guarantees this, because no caller can modify a cached item in place.
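
The "good key elements" point is easy to demonstrate. The following sketch uses a deliberately mutable key type, MutableKey, which is a hypothetical class invented for this example. Once the key is changed after insertion, the dictionary can no longer find the value (barring an extremely unlikely hash collision).

```csharp
using System;
using System.Collections.Generic;

// A deliberately mutable key type (hypothetical, for illustration only).
public class MutableKey
{
  public string Name { get; set; }
  public MutableKey(string name) { Name = name; }

  // Equality and hash code both depend on the mutable Name property.
  public override bool Equals(object obj) => obj is MutableKey other && other.Name == Name;
  public override int GetHashCode() => Name.GetHashCode();
}

public static class KeyDemo
{
  // Returns true if the value can still be found after the key has been mutated.
  public static bool LookupSurvivesMutation()
  {
    var key = new MutableKey("fred");
    var ages = new Dictionary<MutableKey, int> { [key] = 42 };

    key.Name = "freddy";   // mutate the key after insertion

    // The entry was filed under the hash of "fred".
    // Looking up the mutated key uses the hash of "freddy" and misses.
    // Looking up a fresh "fred" key finds the right hash, but the stored key no longer equals it.
    return ages.ContainsKey(key) || ages.ContainsKey(new MutableKey("fred"));
  }
}
```

With an immutable key (Name declared with get only) the mutation would not compile, and the problem could not arise.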

5. Avoid Primitive Obsession

Primitive obsession is where language primitives are used to describe a data type that is more complex than the primitive describing it. Most data types require constraints. Examples include...

A person's age is typically between 0 and 100. (Could use upper value of say 150 to be sure)
An email address, as a minimum contains a @ character and a domain suffix (.COM, .NET, etc)
A Bank Account Number only contains digits
A unique database table row identifier is typically an integer (e.g. customer id)
An IPv4 address contains four integers, each between 0 and 255.

Referring to the above examples and using primitives, the first, third and fourth might use integers; the remainder would likely use strings.

Consider a customer data type, which may be persisted to a database, that consists of a unique id, first and last names and an email address. Using primitive data types we might code this as follows...


class Customer
{
  public int Id { get; }
  public string FirstName { get; }
  public string LastName { get; }
  public string Email { get; }
  
  public Customer(int id, string firstName, string lastName, string email)
  {
    this.Id = id;
    this.FirstName = firstName;
    this.LastName = lastName;
    this.Email = email;
  }
}

There are several problems with this code.

It is possible to inadvertently mix email with first or last name as they are all strings.
There is no size constraint on the name fields whereas there will be on the database side.
The email address could be supplied with an invalid value.
Any of the string fields could be supplied with null or zero-length values.
The id field is a signed integer whereas unique identifiers tend to be positive integers.
There is no distinction between a customer id and other unique identifiers that we will likely require.

string email = "fredbloggs@hotmail.com";
string email2 = "test.exe";

The first email address might well be valid. The second is most definitely not a valid email address. However, using a string to represent an email address does not tell us if the email address is well-formed. The email data type is more complex than the string describing it.

A better way to model an email address might be as follows.

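public class EmailAddress
{
  public string Value { get; }
  
  public static EmailAddress Parse(string value)
  {
    // Normalise the input: treat null as empty, then trim and lower-case.
    string input = string.IsNullOrEmpty(value) ? "" : value.Trim().ToLowerInvariant();

    // Deliberately minimal check: something before the @ and a dot somewhere after it.
    // A real implementation would validate far more thoroughly.
    int at = input.IndexOf('@');
    if (at <= 0 || input.IndexOf('.', at) < 0)
      throw new System.FormatException($"'{value}' is not a valid email address");

    // Store the normalised input rather than the raw value.
    return new EmailAddress(input);
  }
  
  private EmailAddress(string value)
  {
    this.Value = value;
  }
}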

The above code implements a simple email address type. In reality more would be needed, such as testing for equality, hash code generation and so on. Note that I use a static function (Parse, in this example) to validate input before creating the type. The constructor is made private to ensure that the email type can only be constructed using the static creation functions (Parse in this case). The following code is closer to what I might typically use. This example implements the age type.

public class Age
{
  // Exception specific to the Age class. Aids in debugging and logging.
  public class InvalidValueException : Exception
  {
    public InvalidValueException(string message) : base(message) { }
  }

  // Value will always contain a valid age due to smart construction and private constructor
  public byte Value { get; }

  // Smart construction validates age range.
  // Returns a valid age or throws an Age.InvalidValueException
  public static Age Create(int value)
  {
    if (value < 0 || value > 120)
      throw new InvalidValueException("Age should be a value between 0 and 120");
    return new Age((byte)value);
  }

  // Smart construction parses and validates an age from a string.
  // Returns either a valid age or throws an Age.InvalidValueException
  public static Age Parse(string value)
  {
    int result = 0;

    if (!int.TryParse(value, out result))
      throw new InvalidValueException($"Unable to convert string, {value} to age");
    return Create(result);
  }

  private Age(byte value)
  {
    this.Value = value;
  }
}
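
The same approach works for the customer id. The sketch below also shows the equality and hash code support mentioned earlier. The positive-only rule and the choice of ArgumentOutOfRangeException are my assumptions for illustration; adjust them to suit your own identifiers.

```csharp
using System;

// A sketch of a customer id type. The positive-only rule is an assumption.
public sealed class CustomerId : IEquatable<CustomerId>
{
  public int Value { get; }

  // Smart construction: returns a valid id or throws.
  public static CustomerId Create(int value)
  {
    if (value <= 0)
      throw new ArgumentOutOfRangeException(nameof(value), "Customer id must be a positive integer");
    return new CustomerId(value);
  }

  private CustomerId(int value)
  {
    this.Value = value;
  }

  // Two ids are equal when their values are equal.
  public bool Equals(CustomerId other) => other != null && other.Value == Value;
  public override bool Equals(object obj) => Equals(obj as CustomerId);
  public override int GetHashCode() => Value.GetHashCode();
  public override string ToString() => $"CustomerId({Value})";
}
```

Because CustomerId is its own type, the compiler will reject an attempt to pass, say, an order id or a plain int where a customer id is expected.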

6. Never Use Global Variables

Did I say never? Generally speaking (read: almost always), global variables are bad. Really bad. As in, don't ever use them. The reason is simple: an unprotected global variable can be modified anywhere within the software, which makes tracking changes extremely difficult. Of course, data has to start somewhere, which generally means that you might have one global variable. This single global variable typically contains the global state used by the app.

A better idea is to have a bootstrap function that creates the initial state and passes it on to the main app. If this is not possible then you should wrap the initial state and supply get and set accessors. This will at least ensure that you can add tracing or breakpoints to anything that modifies the global state.

If you follow one of the strongest rules, always ensure functions receive their dependencies as parameters, then having a global variable isn't such a big deal. What is a big deal, and what will come back to bite you, is functions that reach out to global variables directly rather than receiving them as parameters.

Consider a simple example (in C#).

class Globals
{
  public static int ThisIsGlobal { get; set; } 
}

class Test
{
  public static void Func1()
  {
    Globals.ThisIsGlobal = 1;
  }
  
  public static void Func2()
  {
    Globals.ThisIsGlobal = 2;
  }
}

A somewhat contrived example, I agree. That said, I hope it proves a point. Throughout the lifetime of the software, Func1 or Func2 may be called. Both are bad functions in my opinion. Firstly, they do not accept Globals as an input parameter. Secondly, both functions produce side-effects: they both modify the global state.

I probably need to, and will, write more about this. I will say this: if you have global variables, have only one (it can be a structure, a class, etc.). Wrap the global variable so that, if need be, you can log access to and modification of it. Better still, have a bootstrap file that creates the initial state and passes that state to the main controller. The main controller will be a UI element in a UI-driven app, or an entry function in a non-UI app.
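
A minimal sketch of the bootstrap idea follows. AppState, Bootstrap and MainController are hypothetical names chosen for illustration, and the single Counter value simply stands in for whatever state your app really needs.

```csharp
using System;

// Hypothetical application state. Immutable, so nothing can change it behind our backs.
public sealed class AppState
{
  public int Counter { get; }
  public AppState(int counter) { Counter = counter; }
  public AppState WithCounter(int counter) => new AppState(counter);
}

public static class Bootstrap
{
  // The initial state is created in exactly one place.
  public static AppState CreateInitialState() => new AppState(0);
}

public static class MainController
{
  // State arrives as a parameter and the updated state is returned.
  // Neither function reads or writes a global variable.
  public static AppState Func1(AppState state) => state.WithCounter(1);
  public static AppState Func2(AppState state) => state.WithCounter(2);

  public static AppState Run(AppState initial)
  {
    AppState afterFirst = Func1(initial);
    return Func2(afterFirst);
  }
}
```

The entry point would then call MainController.Run(Bootstrap.CreateInitialState()). Every change to the state is visible in a function signature, which is exactly what the Globals version hides.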

7. Release First Draft ASAP

Releasing the first draft allows all stakeholders to determine if the software is on the right track. If it is, great, move on to the next feature. If not, this is a great chance to redefine the original problem statement.

In the early days of software development, it was commonplace to write pages and pages of text that described the overall purpose of the software. In my experience, this approach doesn't work. The reason is that software requirements are not only complex, they also evolve as both stakeholders and engineers realise the system.

The initial release (a prototype, if you will) is a great opportunity for all involved to give an opinion. Again, in my experience, this is where the real requirements come about. So many times I have been given a set of requirements and delivered a basic solution, only to find that the deliverable is not what the customer wanted. What a customer thinks they want and what they actually want are, in my experience, completely different.

I guess this is why we call this software and not hardware. Hardware is set in stone, cannot be changed. Software on the other hand...

8. Release Often

Software design used to involve many drawings: static diagrams, runtime analysis diagrams and so on. Basically, design as much up front as possible, then go away and code. The problem is that gathering the requirements could take months at best. I know a little about human nature and a lot about customers: customers never really know what they want, they have ideas as to what they want. This is not a bad thing; it just means that pages and pages of requirements will never work.

So what is the answer? Regular deliverables. This allows stakeholders/customers to see the software in situ and see the software grow. Potential problems can be weeded out. You can never get this level of clarity in a written document. Sometimes, people just need to see!

Releasing on a regular basis also keeps the cost of a wrong turn small. You or your team might spend a week developing a new feature (the outline only, I hope). You then present this to the stakeholders. If they like it, great: you can ask some questions, get some depth, then go away and implement a full solution. If they hate it, you may have wasted five days of development time. Not a big deal in the scheme of things. Even a failure is likely to bring out features. Sometimes when people see what they don't like, they have a eureka moment and realise what they would like.

9. Always Supply Function Dependencies

For maintainable software this is a huge rule for me. Always supplying dependencies to a function achieves the following.

The function will (should) produce the same output given the same inputs
A (pure) function can only produce an output based upon its input

As an example, assume a function, SaveCustomer, that simply takes a customer object. What does saving a customer really mean? Save to a database? Save to a CSV file, an XML file and so on?

In software some concepts are truly open ended. The concept of saving customer details is an example. Let's add code to see why.


class Customer
{
  public CustomerId Id { get; }
  public string FirstName { get; }
  public string LastName { get; }
}

Assume that the above code contains a single constructor where one specifies the id, first and last name. Now assume that we have a class, Database, that contains a function that saves customer details. The class might look as follows.


class Database
{
  public static void Save(Customer customer)
  {
    // code to save customer to a database
  }
}

You may even have a top entry class. I tend to do this and usually call the class App. My app class usually acts as a kind of gateway, even an API to allow other parts of my software to call important code. As an example I might have something like the following.


static class App
{
  private static readonly Database _db = new Database("connection string");
  
  public static void Save(Customer customer)
  {
    using (var trans = _db.Open())
    {     
      _db.Save(customer);
    }
  }
}

Of course, in the above code, the database is slightly different to the one I originally presented. My point is that at the top level, not supplying dependencies is OK. In the above example, I don't supply the concrete database class as a parameter. I declare it as a static member of the class.

In summary, non-top level functions should always explicitly state inputs. Top-level functions can ignore this rule and create dependencies within the function body. This should only ever occur at top-level functions.
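
For a non-top-level function, supplying the dependency might look like the following sketch. ICustomerStore, InMemoryCustomerStore and CustomerService are hypothetical names of my own, and the Customer class is trimmed down to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;

// Trimmed-down customer, for illustration only.
public class Customer
{
  public string FirstName { get; }
  public string LastName { get; }
  public Customer(string firstName, string lastName)
  {
    FirstName = firstName; LastName = lastName;
  }
}

// Hypothetical abstraction over "somewhere customers can be saved".
public interface ICustomerStore
{
  void Save(Customer customer);
}

// One possible implementation. A database, CSV or XML version would implement the same interface.
public sealed class InMemoryCustomerStore : ICustomerStore
{
  public List<Customer> Saved { get; } = new List<Customer>();
  public void Save(Customer customer) => Saved.Add(customer);
}

public static class CustomerService
{
  // The store is an explicit input. The function neither creates it nor finds it globally.
  public static void SaveCustomer(ICustomerStore store, Customer customer)
  {
    if (store == null) throw new ArgumentNullException(nameof(store));
    if (customer == null) throw new ArgumentNullException(nameof(customer));
    store.Save(customer);
  }
}
```

The signature now answers the question of what "save" means: it means whatever the supplied store does. A top-level class such as App remains free to decide which concrete store to create.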

Refactor Regularly

As a project progresses, you will most likely start to think that the original code leaves much to be desired. This has happened to me often (even though the code works and works well). This is not the end of the world; you simply need to refactor your code. Refactoring is similar to writing code from scratch using new knowledge. The real difference is that you will be replacing existing code with new code, possibly using new lessons learned.

Following a release, I think it is always worthwhile refactoring the existing code.

Use Higher-Order Functions

Use Interfaces For Varying

Treat Exceptions And Anticipated Errors Differently

Favour Composition Over Inheritance

Saturday, 15 May 2021

Blogger.Com Tips

If you use Blogger.com for content I have a few tips that may help develop the website you really want.

Why use Blogger.com

To date, blogger.com is free to use. You don't even have to buy a domain name! There are some shortcomings as always with free stuff.

Layout is more or less fixed. You can change it for certain templates, but the options are limited.
Colour schemes seem to be ad hoc. I tried selecting dark gray on the simple theme and the overall colour was red.
Link management is hard work. Google Websites manages links so much better.
A ton of additional resources, JavaScript, CSS etc are required for each page. This tends to increase page loading times.

Of course, there are some benefits.

Editor allows for HTML or compose view.
Can add your own CSS to style certain elements as you see fit. (Thinking paragraph tags here.)
Easy to integrate Google Adsense.

Blogger.com Tips

I am quite new to Blogger but the following are tips that may prove worthwhile.

Bookmarks

A bookmark, in the web world, is a mechanism that allows viewers of your site to jump to in-page content. Bookmarks are basically in-page navigation. To create a bookmark you simply need to create an HTML element with a unique id. In this post I created an A element which simply contains a unique id.
```
<a id="top"></a>
```
Notice how the A element only contains an id. The id must be unique to the page (not the entire site). I can now link to the new bookmark, again using simple HTML code. Click here to use my bookmark.

Basically, a bookmark is an HTML element with a unique id (unique to the page, not the website). To reference a bookmark, simply use the anchor (A) element and set the href attribute to the unique id preceded by a #. To elaborate.
```
<h3 id="Bookmark">My heading</h3>
```
The above code declares a bookmark. The id property is what A elements will use when declaring an inline jump to content.
```
<a href="#Bookmark">My heading</a>
```
Notice how for the bookmark reference the id is preceded by a hash (#). Anyone clicking on the My heading link will be taken to the HTML element containing the id.

Paragraphs

Been a while, but I do remember that in school, when learning English, the first line of a paragraph is always indented. This can be achieved in the "Compose view" editor. Simply use the tab key to insert a tab.

The problem is that a tab will result in several non-breaking spaces (&nbsp;). This is far from ideal. A better way is to use CSS to state that all paragraphs will indent the first line.

Fortunately, Blogger allows one to add custom CSS. Simply navigate to theme, then select customise. Select advanced and navigate to the bottom where you should see Add CSS. The following CSS will indent all paragraphs.
```
p {
 text-indent: 10px;
}
```
The value 10px can be changed to suit your own needs.

How To Host A Website Using Windows 10

Learn how to host a simple website on Windows 10 using Internet Information Services (IIS). The tutorial will show how to install and set up IIS and how to host a single page website. The tutorial lays the foundations for building more complex websites.

Installing IIS
Create simple, one page website
- Create website directory
- Create a simple webpage
Setup IIS to host new website
- Create a virtual directory
- Test website

Installing IIS

Installing IIS is trivial, just follow the steps below.

Find and run feature selection

Click the Windows start icon (usually bottom leftmost) and begin typing features. The best match window should appear. Select Turn Windows features on or off. The following image illustrates the idea.

Select features

You should now see the Windows features popup. Scroll down until you see Internet Information Services. Select Internet Information Services. You can accept the defaults and click OK to install. Installation should take less than a minute. The following image shows the default settings on my system.

Check installation

Assuming IIS installed correctly you now have the ability to host your own website. Hosting your own website allows you to try out ideas before committing online. It also allows you to try out different web technologies, PHP, web forms and so on. To test if IIS is up and running on your system try the following.

Open your favourite web browser.
In the address bar type localhost
If all is well you should see a page similar to the following.

Create simple, one page website

The basic test for this step is to see if we can create a simple web page that can be viewed using a web browser. To create a simple website we will create a new directory, add an HTML page and modify IIS settings to ensure that we can view the new web page. Complete the following steps to achieve this.

Create website directory

IIS obtains content from a fixed drive. You can use an external drive but, for performance, a fixed drive is best. Each website you create will typically live in its own directory. For this example I will create a new folder named MyWebsite under my root drive, C. The following image illustrates the idea.

Create a simple webpage

Webpages come in all shapes and sizes and may deliver static or dynamic content. For this example we will create a simple, static HTML page. To create the HTML webpage you need a text editor. Notepad will suffice for demonstration purposes, though I wouldn't recommend it for more complex websites.

Open Notepad. You can do this by selecting the Windows 10 Start menu and typing notepad. Enter the following text.

<p>Hello, from my new website</p>

Now select save as from the file menu. Be sure to select All Files and then type index.html as the file name. The following image illustrates the idea.

Check progress

At this point you should have a folder, MyWebsite, under your root drive (I am using C drive in this example). You should also have an index.html file in the MyWebsite folder. The following image illustrates what you should have.

You should be able to view the index.html file in your default browser. To try, use File Explorer to navigate to C and then to MyWebsite. Then double-click index.html in the MyWebsite folder.

Setup IIS to host new website

In the previous section we created a simple HTML webpage. Now, we need to let IIS know where our website lives. There are a number of ways to achieve this. For now we will keep it simple and create a virtual directory.

Create a virtual directory

A virtual directory is an alias for a physical directory (i.e. one located on your hard disk). A virtual directory becomes part of the web application's URL. For example, the default IIS installation contains a simple web application. You can view the web application by opening a browser and navigating to localhost. We will add a new virtual directory, mysite, to the IIS web application. This will allow us to navigate to our simple website using localhost/mysite. To do this, perform the following.

Open up IIS (you can do this by typing IIS after selecting Windows 10 Start icon).
Expand the top item under Connections until you can see Default Web Site.
Right click Default Web Site and select Add Virtual Directory.
Enter mysite under Alias and browse to the folder where your index.html file resides. The following image illustrates the result on my system.

Test Website

You can now test your new website. Open your favourite web browser, enter localhost/mysite into the address bar and go (pressing enter usually does the trick). If all has gone well you should see the text, Hello, from my new website. The following image illustrates the result on my system using MS Edge.

Friday, 14 May 2021

Functional Programming

Functional programming, as one might expect, uses functions to develop software. Functions are a primary construct, and can be passed to other functions, composed, chained and so on. Functional programming is fairly straightforward if you follow some guidelines.

What is functional programming?

Functional programming places functions first. All aspects of the resultant software are built upon functions. In contrast, Object-Oriented development places objects first. In this case, objects encapsulate state and, potentially, behaviour. In functional programming, data and behaviour are typically kept separate. In my experience, the two techniques can be summarised as follows.

Object-Oriented

Well-suited for user interface development, as widgets (e.g. buttons, text inputs, etc.) can maintain state such as text, plus event handlers for when, say, a button is clicked or text is edited.

Well-suited for data types such as stacks, queues, dictionaries, etc., where managing state and exposing behaviour in a single place makes sense.

Functional

Better suited where partitioning state across different objects is nonsensical or becomes problematic.

Better suited where data types are complex, for example invoice processing, game state and so on. More complex data types are better manipulated using functions rather than "attaching" functions to the data type (functions attached to a data type are typically known as methods in Object-Oriented development).

What is a function?

A function accepts one or more inputs and generates a single output. It is also possible for a function to accept zero inputs and just return a value. The main takeaway should be that a function can only create an output based upon the input values supplied. This is important! What this means is that a function cannot use outside state (global state) to generate an output. A function that only uses supplied inputs to generate an output is known as a Pure Function.
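
To make the distinction concrete, here is a small sketch. The Pricing class and its members are hypothetical names invented for the example.

```csharp
using System;

public static class Pricing
{
  // Outside state. Anything in the program can change this value.
  public static decimal VatRate = 0.20m;

  // Impure: the result depends on VatRate, which is not in the parameter list.
  public static decimal GrossImpure(decimal net) => net + net * VatRate;

  // Pure: everything needed is passed in, so equal inputs always give equal output.
  public static decimal GrossPure(decimal net, decimal vatRate) => net + net * vatRate;
}
```

Calling GrossImpure(100m) twice can give two different answers if some other code changes VatRate in between. GrossPure(100m, 0.20m) gives 120 no matter what else the program does.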

How do I code a good function?

To create a solid, sound function that can be reasoned about, certain rules must be followed.

Use pure functions

As mentioned, a pure function generates an output based purely upon its inputs. Using global state is forbidden. Pure functions go a step further: executing code that produces any side-effects is also forbidden. This includes writing to the console, writing to log files and updating a database. In short, a pure function should always produce the same output given the same inputs. No side-effects, no using global variables. In pure functional programming even throwing exceptions is a no-go. This makes sense, as throwing an exception is a side-effect, which goes against the notion of a pure function.

An exception is really just a goto on steroids. Exceptions can make following program flow and debugging difficult. Well-written software shouldn't need much debugging. Sure, debugging a new function whilst it is still in development is likely a must. Once the function has been tested you should be able to trust it and move on.
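
One common alternative to throwing is to return a value that describes either success or failure. The following is a minimal sketch of that idea applied to parsing an age. ParseResult and AgeParser are hypothetical names of my own, and the 0 to 120 range is simply an assumed rule for the example.

```csharp
using System;

// Hypothetical result type: either a value or an error message, never both.
public sealed class ParseResult
{
  public bool Success { get; }
  public int Value { get; }
  public string Error { get; }

  private ParseResult(bool success, int value, string error)
  {
    Success = success; Value = value; Error = error;
  }

  public static ParseResult Ok(int value) => new ParseResult(true, value, "");
  public static ParseResult Fail(string error) => new ParseResult(false, 0, error);
}

public static class AgeParser
{
  // No exception is thrown. Failure is part of the return value and the caller inspects it.
  public static ParseResult Parse(string text)
  {
    if (!int.TryParse(text, out int age))
      return ParseResult.Fail($"Unable to convert string, {text} to age");
    if (age < 0 || age > 120)
      return ParseResult.Fail("Age should be a value between 0 and 120");
    return ParseResult.Ok(age);
  }
}
```

Program flow stays in plain sight: the caller checks Success and carries on, with no hidden jump to a catch block somewhere up the call stack.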

There are ways to implement state changes whilst still adhering to the rules. I shall go into details later.

Use immutable data

Mutable data is data that can change in place. I remember my C/C++ days when dealing with strings. It was common to modify existing strings. This is more performant than creating new strings based upon old strings.

Times have changed, as have data structures, memory and so on. Immutable data is now the way forward. This essentially means once you have created some data, an object, record whatever your language permits, it never changes. If you need different values, you create new data, possibly based upon the original data.

Take some simple data representing X and Y coordinates in 2D space. Using C#, the code to represent the data is as follows.
```
public struct Point
{
  // get and set are C# constructs that allow one to set or get a data value.
  // get allows one to read a data value.
  // set allows one to update (write to) a data value.
  public int X { get; set; }
  public int Y { get; set; }

  public Point(int x, int y)
  {
    this.X = x;
    this.Y = y;
  }

  public void Offset(int xOffset, int yOffset)
  {
    this.X += xOffset;
    this.Y += yOffset;
  }
}
```
Now assume the following test class.
```
public static class TestPoint
{
  public static void Test1()
  {
    Point pt = new Point(20, 40);
    pt.X = 1000;
    pt.Offset(0,20);
    // pt now contains the values x=1000 and y=60.
  }
}
```
In the above code, pt was created to have an X value of 20 and a Y value of 40. The second statement assigns 1000 to X. This directly modifies the pt variable. This is known as in-place modification. The ability to modify data content in-place makes the data structure mutable. The code also contains an Offset method. This method adds x and y offsets to the existing data.

How do I refactor mutable code to immutable code?

In this case, the transition from mutable to immutable code is easy. Make all data fields readonly and supply a constructor to initialise the data fields. The following code example illustrates the idea.


public struct Point
{
  // A field that contains only a get construct is deemed to be readonly.
  // The field value may only be assigned via the constructor.
  public int X { get; }
  public int Y { get; }

  public Point(int x, int y)
  {
    this.X = x;
    this.Y = y;
  }

  public Point WithOffset(int xOffset, int yOffset)
  {
    return new Point(X + xOffset, Y + yOffset);
  }
}

public static class TestPoint
{
  public static void Test1()
  {
    // Create a pt variable as per previous example.
    Point pt = new Point(20, 40);
    
    // Unable to modify the pt variable as fields are readonly.
    // We need to create a new data, i.e. a new instance of Point.
    // We create a new point by adding 1000 to the current X and zero to the Y value.
    Point ptNew = pt.WithOffset(1000, 0);
  }
}

Notice how the new code doesn't quite behave in the same way as the previous code. In the previous code we directly set the X value to 1000. To achieve the same result with the new, immutable data we need to add a new function. Let's call it WithX.

public struct Point
{
  // A field that contains only a get construct is readonly.
  // The field value may only be assigned via the constructor.
  public int X { get; }
  public int Y { get; }

  public Point(int x, int y)
  {
    this.X = x;
    this.Y = y;
  }

  public Point WithOffset(int xOffset, int yOffset)
  {
    return new Point(X + xOffset, Y + yOffset);
  }
  
  public Point WithX(int x)
  {
    return new Point(x, Y);
  }
}
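
With WithX in place, the earlier mutable test can be reproduced without modifying anything. The snippet below is a sketch that repeats a compact copy of Point so that it compiles on its own. Marking the struct readonly is my addition, not part of the code above.

```csharp
public readonly struct Point
{
  public int X { get; }
  public int Y { get; }

  public Point(int x, int y) { X = x; Y = y; }

  public Point WithOffset(int xOffset, int yOffset) => new Point(X + xOffset, Y + yOffset);
  public Point WithX(int x) => new Point(x, Y);
}

public static class TestPoint
{
  public static Point Test1()
  {
    Point pt = new Point(20, 40);

    // Each call returns a new Point; pt itself never changes.
    Point ptNew = pt.WithX(1000).WithOffset(0, 20);

    // ptNew contains x=1000 and y=60, pt still contains x=20 and y=40.
    return ptNew;
  }
}
```

Chaining works because every With function returns a complete, valid Point that the next call can build upon.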
