Session data is evil

I’ve been working in ASP.NET MVC recently after working in Java for a long time. One of the things that struck me was how commonly session data is used in web applications.

Now I know that people can and do abuse sessions in Java, but the default routing and ease of access make it more tempting in ASP.NET MVC. The standard routing convention of “/Controller/Action/:id” means you need to write explicit routes to get RESTful paths with multiple IDs, like “order/2/item/3”, for non-trivial scenarios, and out-of-the-box conveniences like “TempData” seem to offer magical persistence between requests. These incentives combine to make session data the path of least resistance in ASP.NET MVC.


Any data stored in session is inherently unreliable, and using it makes load balancing and scaling your application much more difficult. Once you use it, each instance of your web application must be able to find the user’s session data to reliably handle requests. Since it’s now extremely common to run multiple instances even for small applications (it’s irresponsible not to, for disaster recovery and redundancy), you will need to think about this before you deploy to production.

It also adds hidden complexity to testing your application. Each endpoint which relies on state stored in session needs to be tested with that application state simulated. This means you have at least two places in your code defining and using the same semi-structured data, which makes your tests complex/fragile and your code harder to maintain.


Once you make session data part of your architecture it’s very hard to refactor and remove it. That little innocent use of TempData to store details from the last request will spread as developers think “If it was OK there then it’s OK here…” and “one more can’t hurt” (the broken windows theory). Now your user flows in the web application rely on details stored in session to go from screen A to B to C, and refactoring them means rewriting and retesting a lot of the view/controller logic to replace the data held in session.

There are acceptable uses for session data in web applications; authentication is the obvious one. What they have in common is an alternative flow that copes when session data is not found, without breaking functionality.
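As a rough illustration of such an alternative flow, the sketch below treats session purely as a cache: when the value is missing, it is rebuilt from the source of truth instead of failing. The `IDictionary<string, object>` is only a stand-in for the real session object, and `GetOrReload` and the `reload` delegate are hypothetical names for this example, not part of ASP.NET MVC.

```csharp
using System;
using System.Collections.Generic;

public static class SessionFallbackSketch
{
    // Returns the cached value when it is present and of the expected type;
    // otherwise rebuilds it via reload() and stores it for the next request.
    public static T GetOrReload<T>(IDictionary<string, object> session, string key, Func<T> reload)
    {
        if (session.TryGetValue(key, out object cached) && cached is T)
        {
            return (T)cached;
        }

        T fresh = reload();   // e.g. a database lookup keyed on the authenticated user
        session[key] = fresh; // caching is optional; the request works without it
        return fresh;
    }
}
```

If the session store is wiped or the request lands on another instance, the only cost is one extra call to `reload`.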

If you have an over-reliance on session in your application you are making a flaky, hard-to-scale and hard-to-maintain application that will at best limp into production. At worst it will fall over and take your users’ data with it.

There are common patterns and methods to avoid needing session data, below are some links to help:

Design for Devs – Change sequence diagrams

I’ve been asked a few times by junior developers how to get started in designing code, as if it’s some sort of art technique. In truth every developer is doing design, no one spontaneously writes complex formal logic. Most just do it in their head based on experience and patterns they’ve used before. For small well understood problems this is fine, but when you are dealing with larger or more complex changes doing a little design work up-front can really help clarify things and ultimately save time while giving a better solution.

I’m writing this to document a simple type of design I’ve used on many projects, a change sequence diagram, one you can do quickly on paper or on a whiteboard in ten minutes and I’ve found to be very helpful in thinking about what changes are required, the size of the overall change and promoting re-use of existing code.

Here’s an example:

[Image: change sequence diagram]

It’s a pretty simple variation of a sequence diagram, where you show the sequence of events which should occur as a series of interactions between the involved people/components. It normally starts with a person, like someone clicking a link on a web page, then shows how the system responds. The change part is about highlighting which components in each part of the system need to change to handle the functionality: what needs to be added, updated or removed.

Doing this forces you to think up-front about what you will need to change and how the system will interact to get the end result. It ties the individual component changes to the overall user requirement, e.g. you’re not just adding a new database column and view field, you’re adding them so the user can see and update their middle name on the personal details screen. This helps you understand how the parts of your system interact and use consistent implementations and design patterns in your changes, plus identify the unit tests and test scenarios.

When you are done, the number and type of changes show the scale of the overall change, which is useful for estimates, and break it down into manageable chunks of work. You’ll get the best results if you do it paired with someone or get someone else to review your design. Doing this checks that you aren’t breaking existing patterns in the code or missing something that could increase or decrease the complexity. You can expand it to include alternate flows and consider NFRs for security and performance.

Next time you’re looking at a new requirement or user story give this a try, you’ll be surprised how easy it is to do and what you’ll get out of it.

The incremental complexity trap

This is a common problem which developers face, particularly in agile projects, where it’s normal for new user stories to build on previous ones by adding new functions to existing screens and flows.

It happens all the time. You start with something like a simple screen and do the necessary work to make it. Then you get a few new requirements: some more fields, new validation for an alternate user flow and so on. You do the same again, adding the fields to your screens and the logic to the controllers. Then it happens again, then again and again. More user flows are added, more fields appear in some of those flows, and the validation gets more complex. Your original simple screen, controller and validation logic is now a monster, unmaintainable and a nightmare to debug.

Often teams don’t do anything to solve it, just live with the problems. Commonly by the time they realise what’s happening it’s easier just to keep layering on the complexity rather than deal with it properly. No one wants to be holding the bag when it’s time to stop and refactor everything.

This can be particularly bad if it’s happening in multiple places at the same time during a project. The effects of shortsighted decisions start to snowball, affecting development speed and increasing regression issues, and it’s not feasible to refactor everything (try selling a sprint full of that to a Product Owner).

So how do you avoid the trap?

  • Keep it simple

Establish code patterns that encourage separation of concerns, make the team aware of them and how to repeat the patterns.

  • Unit tests

The complexity of your tests helps highlight when things are getting out of hand before it’s too late, and having tests makes it much easier to refactor by reducing the chance of regression issues.

  • Anticipate it before it happens and design

This is the job of the Technical Architect, to know what requirements and stories are coming and how they will be implemented. If an area is going to get a lot of complexity it needs to be handled or you will end up in the trap.

What can you do if you are already in it?

  • Stop making it worse

For complex classes stop adding new layers of logic. Obvious, but the temptation will be there. Stop and plan a better approach, as the sooner you do this the easier it will be.

  • Don’t try for perfection and refactor everything

Big bang refactors are risky and time consuming. Take part of the complexity and split it out, establishing a pattern to remove more.

  • Focus on your goals

Overly complex classes aren’t bad on principle. They are bad because they slow development and give bugs places to hide. Refactoring and creating a framework to add new functionality can speed development and reduce the occurrence of defects, which also eat time. You should focus your effort on refactoring areas which will need more complexity later, investing time now to save more later.
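One way to split out part of the complexity, sketched here for validation logic, is to move each rule out of the controller into its own small class behind a shared interface. All of the names below (`PersonForm`, `IValidationRule`, `Validator` and the two rules) are hypothetical and only illustrate the shape of such a pattern.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical form model; the fields are illustrative only.
public class PersonForm
{
    public string Name { get; set; }
    public DateTime? DateOfBirth { get; set; }
}

// One small class per rule. Check returns null when the input passes.
public interface IValidationRule<T>
{
    string Check(T input);
}

public class NameRequiredRule : IValidationRule<PersonForm>
{
    public string Check(PersonForm form) =>
        string.IsNullOrWhiteSpace(form.Name) ? "Name is required." : null;
}

public class DateOfBirthInPastRule : IValidationRule<PersonForm>
{
    private readonly DateTime today;

    // The current date is injected so the rule is easy to unit test.
    public DateOfBirthInPastRule(DateTime today) { this.today = today; }

    public string Check(PersonForm form) =>
        form.DateOfBirth.HasValue && form.DateOfBirth.Value > today
            ? "Date of birth cannot be in the future."
            : null;
}

// The controller talks to this class only, not to the individual rules.
public class Validator<T>
{
    private readonly IReadOnlyList<IValidationRule<T>> rules;

    public Validator(params IValidationRule<T>[] rules) { this.rules = rules; }

    // Runs every rule and collects the error messages of those that fail.
    public List<string> Validate(T input) =>
        rules.Select(rule => rule.Check(input))
             .Where(error => error != null)
             .ToList();
}
```

A new requirement then becomes a new rule class with its own unit test, rather than another branch in an already large controller method.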

ASP.NET MVC Datatables server-side

This is an example implementation of JQuery Datatables with server-side processing. The source is here.


Introduction

JQuery Datatables is a great tool, attach it to a table of results and it gives you quick and easy sorting/searching. Against a small dataset this works fine, but once you start to have >1000 records your page load is going to take a long time. To solve this Datatables recommend server-side processing.

This code is an example of implementing server-side processing for an ASP.NET MVC web application, using a generic approach with Linq so that you can re-use it for different entities easily with little code repetition. It also shows an implementation of full word search across all columns, which is something that the Javascript processing version offers but is very tricky to implement on the database side with decent performance. It’s a C# .NET implementation, but you can take the interfaces and calls from the controllers and convert the approach for Java or Ruby (missing the nice Linq stuff, though).

Details

I’ll skip the basic view/js details as that is easily available on the datatables documentation.

The request comes into the controller as a GET with all the sort/search details as query parameters (see here), it expects a result matching this interface:

public interface IDatatablesResponse<T>
{
    int draw { get; set; }
    int recordsTotal { get; set; }
    int recordsFiltered { get; set; }
    IEnumerable<T> data { get; set; }
    string error { get; set; }
}

The controller extracts the parameters, creates the DB context and repository and makes three calls asynchronously:

  • get the total records
  • get the total filtered records
  • get the searched/sorted/paged data

The data is returned and Datatables Javascript uses it to render the table and controls for the correct searched/sorted/paged results.
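The controller flow described above can be sketched roughly as follows. This is not the original controller: `DatatablesResponse`, `BuildResponseAsync` and `InMemoryNamesRepository` are hypothetical names for illustration, and the repository interface is repeated from this article only so the block compiles on its own. The sketch awaits the three calls one after another, on the assumption that they share one Entity Framework context, which does not support several operations running at once.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Linq;
using System.Threading.Tasks;

// Repeated from the article so the sketch is self-contained.
public interface IDatatablesRepository<TEntity>
{
    Task<IEnumerable<TEntity>> GetPagedSortedFilteredListAsync(int start, int length, string orderColumnName, ListSortDirection order, string searchValue);
    Task<int> GetRecordsTotalAsync();
    Task<int> GetRecordsFilteredAsync(string searchValue);
    string GetSearchPropertyName();
}

// Hypothetical response type with the same properties as IDatatablesResponse<T>.
public class DatatablesResponse<T>
{
    public int draw { get; set; }
    public int recordsTotal { get; set; }
    public int recordsFiltered { get; set; }
    public IEnumerable<T> data { get; set; }
    public string error { get; set; }
}

public static class DatatablesSketch
{
    // Makes the three repository calls and packs the results into one response.
    public static async Task<DatatablesResponse<T>> BuildResponseAsync<T>(
        IDatatablesRepository<T> repository, int draw, int start, int length,
        string orderColumnName, ListSortDirection order, string searchValue)
    {
        try
        {
            int total = await repository.GetRecordsTotalAsync();
            int filtered = await repository.GetRecordsFilteredAsync(searchValue);
            var page = await repository.GetPagedSortedFilteredListAsync(start, length, orderColumnName, order, searchValue);
            return new DatatablesResponse<T> { draw = draw, recordsTotal = total, recordsFiltered = filtered, data = page };
        }
        catch (Exception ex)
        {
            // Report the failure through the error property instead of throwing.
            return new DatatablesResponse<T> { draw = draw, data = Enumerable.Empty<T>(), error = ex.Message };
        }
    }
}

// In-memory stand-in for a database-backed repository, for demonstration only.
public class InMemoryNamesRepository : IDatatablesRepository<string>
{
    private readonly List<string> names;

    public InMemoryNamesRepository(IEnumerable<string> names) { this.names = names.ToList(); }

    private IEnumerable<string> Filter(string searchValue) =>
        string.IsNullOrWhiteSpace(searchValue) ? names : names.Where(n => n.Contains(searchValue));

    public Task<int> GetRecordsTotalAsync() => Task.FromResult(names.Count);

    public Task<int> GetRecordsFilteredAsync(string searchValue) => Task.FromResult(Filter(searchValue).Count());

    public Task<IEnumerable<string>> GetPagedSortedFilteredListAsync(int start, int length, string orderColumnName, ListSortDirection order, string searchValue)
    {
        var filtered = Filter(searchValue);
        var sorted = order == ListSortDirection.Ascending
            ? filtered.OrderBy(n => n, StringComparer.Ordinal)
            : filtered.OrderByDescending(n => n, StringComparer.Ordinal);
        return Task.FromResult<IEnumerable<string>>(sorted.Skip(start).Take(length).ToList());
    }

    public string GetSearchPropertyName() => null;
}
```

In a real controller the `draw`, `start`, `length`, ordering and search values would come from the query parameters of the request, and the returned object would be serialised to JSON.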

The magic happens in the DatatablesRepository objects which handle those calls.

DatatablesRepository classes

Interface:

public interface IDatatablesRepository<TEntity>
{
    Task<IEnumerable<TEntity>> GetPagedSortedFilteredListAsync(int start, int length, string orderColumnName, ListSortDirection order, string searchValue);
    Task<int> GetRecordsTotalAsync();
    Task<int> GetRecordsFilteredAsync(string searchValue);
    string GetSearchPropertyName();
}

The base class DatatablesRepository has a default implementation which provides generic logic for paging, searching and ordering an entity:

protected virtual IQueryable<TEntity> CreateQueryWithWhereAndOrderBy(string searchValue, string orderColumnName, ListSortDirection order)
{
    ...
    query = GetWhereQueryForSearchValue(query, searchValue);
    query = AddOrderByToQuery(query, orderColumnName, order);
    ...
}

protected virtual IQueryable<TEntity> GetWhereQueryForSearchValue(IQueryable<TEntity> queryable, string searchValue)
{
    string searchPropertyName = GetSearchPropertyName();
    if (!string.IsNullOrWhiteSpace(searchValue) && !string.IsNullOrWhiteSpace(searchPropertyName))
    {
        var searchValues = Regex.Split(searchValue, "\\s+");
        foreach (string value in searchValues)
        {
            if (!string.IsNullOrWhiteSpace(value))
            {
                queryable = queryable.Where(GetExpressionForPropertyContains(searchPropertyName, value));
            }
        }
        return queryable;
    }
    return queryable;
}

protected virtual IQueryable<TEntity> AddOrderByToQuery(IQueryable<TEntity> query, string orderColumnName, ListSortDirection order)
{
    var orderDirectionMethod = order == ListSortDirection.Ascending
            ? "OrderBy"
            : "OrderByDescending";

    var type = typeof(TEntity);
    var property = type.GetProperty(orderColumnName);
    var parameter = Expression.Parameter(type, "p");
    var propertyAccess = Expression.MakeMemberAccess(parameter, property);
    var orderByExp = Expression.Lambda(propertyAccess, parameter);
    var filteredAndOrderedQuery = Expression.Call(typeof(Queryable), orderDirectionMethod, new Type[] { type, property.PropertyType }, query.Expression, Expression.Quote(orderByExp));

    return query.Provider.CreateQuery<TEntity>(filteredAndOrderedQuery);
}

The default implementation for creating the Where query (for searching) will only work if you provide a search property name (via GetSearchPropertyName) for a property that exists in the database and holds a concatenation of all the values you want to search, in the format displayed.
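The helper GetExpressionForPropertyContains is not shown in this article, so the following is only a guess at how such a helper could be built with expression trees, run here against an in-memory list. `PropertyContains`, `WhereAllWords` and the `Row` class are hypothetical names. Each search word adds another Where clause, so a row must contain every word to match.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;
using System.Text.RegularExpressions;

// Hypothetical entity with a single concatenated search column.
public class Row
{
    public string SearchString { get; set; }
}

public static class SearchSketch
{
    // Builds the expression p => p.<propertyName>.Contains(value) at runtime.
    public static Expression<Func<T, bool>> PropertyContains<T>(string propertyName, string value)
    {
        var parameter = Expression.Parameter(typeof(T), "p");
        var property = Expression.Property(parameter, propertyName);
        var containsMethod = typeof(string).GetMethod("Contains", new[] { typeof(string) });
        var body = Expression.Call(property, containsMethod, Expression.Constant(value, typeof(string)));
        return Expression.Lambda<Func<T, bool>>(body, parameter);
    }

    // Splits the search text on whitespace and requires every word to be present.
    public static IQueryable<T> WhereAllWords<T>(IQueryable<T> query, string propertyName, string searchValue)
    {
        foreach (var word in Regex.Split(searchValue ?? "", "\\s+"))
        {
            if (!string.IsNullOrWhiteSpace(word))
            {
                query = query.Where(PropertyContains<T>(propertyName, word));
            }
        }
        return query;
    }
}
```

For example, searching two rows with SearchString values "1 Alice Smith 01/02/1990 Finance" and "2 Bob Smith 03/04/1985 Sales" for "Smith Sales" returns only the second row, while "Smith" alone returns both. In memory the match is case-sensitive; against a database the behaviour depends on the column collation.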

You can override the method with a custom implementation if your entity does not support this; here is an example from the Person entity:

public class PeopleDatatablesRepository : DatatablesRepository<Person>
{
    ...
    protected override IQueryable<Person> GetWhereQueryForSearchValue(IQueryable<Person> queryable, string searchValue)
    {
        return queryable.Where(x =>
                // id column (int)
                SqlFunctions.StringConvert((double)x.Id).Contains(searchValue)
                // name column (string)
                || x.Name.Contains(searchValue)
                // date of birth column (datetime, formatted as d/M/yyyy) - limitation of sql prevented us from getting leading zeros in day or month
                || (SqlFunctions.StringConvert((double)SqlFunctions.DatePart("dd", x.DateOfBirth)) + "/" + SqlFunctions.DatePart("mm", x.DateOfBirth) + "/" + SqlFunctions.DatePart("yyyy", x.DateOfBirth)).Contains(searchValue));
    }
}

The same is true of the order by query, which may need customisation to sort some data correctly, e.g. dates. Here is an example from the PersonDepartmentListViewRepository, which swaps the formatted date column for the raw date column before sorting:

public class PersonDepartmentListViewRepository : DatatablesRepository<PersonDepartmentListView>
{
    ...
    protected override IQueryable<PersonDepartmentListView> AddOrderByToQuery(IQueryable<PersonDepartmentListView> query, string orderColumnName, ListSortDirection order)
    {
        if (orderColumnName == "DateOfBirthFormatted")
        {
            orderColumnName = "DateOfBirth";
        }
        return base.AddOrderByToQuery(query, orderColumnName, order);
    }
}
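To see why the swap matters, this small sketch sorts the same dates twice: once by their dd/MM/yyyy text and once by the underlying date value. The class and method names are hypothetical and the dates are made up for the example.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class DateSortSketch
{
    private static string Format(DateTime date) =>
        date.ToString("dd/MM/yyyy", CultureInfo.InvariantCulture);

    // Sorts by the formatted text, which compares the day digits first.
    public static string[] SortByFormattedText(DateTime[] dates) =>
        dates.Select(Format).OrderBy(text => text, StringComparer.Ordinal).ToArray();

    // Sorts by the real date value and formats only for display.
    public static string[] SortByDateValue(DateTime[] dates) =>
        dates.OrderBy(date => date).Select(Format).ToArray();
}
```

Given 10 January 1990, 2 March 1985 and 1 December 2001, sorting by text gives 01/12/2001, 02/03/1985, 10/01/1990, whereas sorting by date value gives 02/03/1985, 10/01/1990, 01/12/2001.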

Using a view will make life much easier, as the data can be pre-formatted and you can supply a search column to do the full word searching. Here’s the view I used to combine results from two tables:

CREATE VIEW [dbo].[PersonDepartmentListView]
AS
SELECT dbo.Person.Id, 
dbo.Person.Name, 
dbo.Person.DateOfBirth,
CONVERT(varchar(10), CONVERT(date, dbo.Person.DateOfBirth, 106), 103) AS DateOfBirthFormatted,
dbo.Department.Name AS DepartmentName,
CONVERT(varchar(10), dbo.Person.Id) + ' ' + dbo.Person.Name + ' ' + CONVERT(varchar(10), CONVERT(date, dbo.Person.DateOfBirth, 106), 103) + ' ' + dbo.Department.Name AS SearchString
FROM  dbo.Person INNER JOIN
       dbo.Department ON dbo.Person.DepartmentId = dbo.Department.Id

Notes

  • If you are displaying date values, be aware that you will need to format the date for display before returning it in JSON, and the date format will affect how you sort the column on the backend, as you will need to sort on the actual date column property rather than the formatted string
  • For effort and performance you are better off creating a view than using complex Linq queries
  • I created the initial example with the help of Stephen Anderson

Play compile error – java.util.NoSuchElementException: Either.left.value on Right

Got this mysterious error when trying to build a Scala Play solution containing multiple projects (common projects which the Play projects depended on):

[error] {file:/mnt/jenkins/workspace/myproj/}mysubproj/compile:sources: java.util.NoSuchElementException: Either.left.value on Right

Turned out the problem was I hadn’t specified the Scala version for the common projects, so they were being built with Scala 2.9.2 while the Play projects tried to build with 2.9.1, causing the issue.

Fixed it by specifying the scala version for all projects in my Build.scala:

scalaVersion := "2.9.1"

Windows XP IE6/IE7 cannot use SSL when hosted on servers with multiple SSL sites

Users reporting SSL certificate errors? They’re on Windows XP IE6/IE7? Other users not seeing the same issues, even on Vista IE7?

Confused? I was. Turns out the answer is pretty simple and well known, IE is crap. Specifically IE7 and lower on Windows XP does not support SNI (Server Name Indication).

This means that if you are hosting your HTTPS site on a server with other sites using HTTPS (port 443), IE will not tell your web server which hostname it wants when requesting the certificate (e.g. give me the cert for mysite1.com versus mysite2.com, like it does when making the normal request). So your web server returns the first one under port 443 that matches, because it has nothing to identify the target configuration. If this is the wrong cert it won’t match the domain and IE will give an SSL error.

There’s no smart way around this on the client side, as nothing can force the client to send the right info in the certificate request. If you want SSL to work for older machines you need to stop relying on SNI: give each site its own IP address (or its own machine), or serve a single certificate that covers all of the hostnames.