Architecting Sharedove Conference Organization Solution: Data First (Part 1)

In the previous post of the Sharedove Architecture Project series, I described what we are going to build over the next few months: a business solution, based on .NET and Microsoft SharePoint, which will be used for conference organization tasks. I also explained why I believe that SharePoint is the right platform for it.

In this post, we will actually start architecting the solution. According to the authors of the book "Documenting Software Architectures: Views and Beyond", the software architecture of a system is the set of structures needed to reason about the system, which comprise software elements, relations among them, and properties of both. This basically means that we need to describe all of the aspects and elements of a software system, and all the relations between them. This includes, for example, business data, business processes, the user interface, various application interfaces, and many other aspects of a complex business solution. If we don’t do that, there is a good chance that the solution development will go in a direction we don’t want it to go. Simply try to imagine building a house without an architectural plan. That wouldn’t work at all, and the same applies here.


Data First

Every software system is ultimately about data: users enter some data into the system, the system processes it, and users get some output. I am a firm believer in the "data first" approach to software architecture, and there are multiple reasons for that. All of the processes that our system needs to automate and support are based on some data. If there is no data, there is nothing to process, as simple as that. In my experience, understanding the data model used in a solution makes all the other segments of the solution architecture easier to understand.

Even the slightest change in the data structure can have a major impact on the whole system, to the point where much of the architecture, and consequently the development, has to be redone. Furthermore, if the data is not properly modeled and structured, the result can be duplicated data, data structures, and functionality, together with the attendant costs of that duplication in development and maintenance.
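To make the duplication point a bit more concrete, here is a minimal Python sketch. The Speaker and Session entities and their fields are hypothetical, invented only for illustration, and are not the schema of the conference solution. The idea is that a session references a speaker instead of carrying its own copy of the speaker's details, so there is only one place to update.

```python
from dataclasses import dataclass


# Hypothetical entities for illustration only, not the actual solution schema.
@dataclass
class Speaker:
    speaker_id: int
    name: str
    email: str


@dataclass
class Session:
    title: str
    speaker_id: int  # a reference to a Speaker, not a copied name or e-mail


speakers = {1: Speaker(1, "Ana Example", "ana@example.org")}
sessions = [
    Session("Opening keynote", speaker_id=1),
    Session("Data modeling workshop", speaker_id=1),
]


def speaker_email(session: Session) -> str:
    """Resolve the e-mail through the reference, so only one copy exists."""
    return speakers[session.speaker_id].email


# One update is visible from every session that references the speaker.
speakers[1].email = "ana@example.com"
emails = [speaker_email(s) for s in sessions]
```

If each session stored its own copy of the e-mail address instead, the same change would have to be repeated for every session, and a missed copy would leave the data inconsistent.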

I have already mentioned in the previous posts of this series that we will be developing different modules of the conference organization solution, which are based on different technologies. But they will necessarily have one thing in common: they will need to share the same data and use the same data structures. That means that the data we model here will need to pass through different services, and it will be serialized and deserialized on numerous occasions.
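As a rough illustration of what sharing one data structure across services implies, the following Python sketch round-trips a single entity through JSON. The Session entity, its fields, and the choice of JSON as the wire format are all assumptions made for this example. The actual solution is based on .NET and SharePoint and may use different formats.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime


# Hypothetical entity for illustration only.
@dataclass(frozen=True)
class Session:
    session_id: int
    title: str
    starts_at: datetime


def session_to_json(session: Session) -> str:
    """Serialize a session; the datetime is written as an ISO 8601 string."""
    payload = asdict(session)
    payload["starts_at"] = session.starts_at.isoformat()
    return json.dumps(payload)


def session_from_json(text: str) -> Session:
    """Rebuild the same structure on the receiving side."""
    payload = json.loads(text)
    return Session(
        session_id=payload["session_id"],
        title=payload["title"],
        starts_at=datetime.fromisoformat(payload["starts_at"]),
    )


original = Session(1, "Opening keynote", datetime(2012, 3, 1, 9, 0))
wire = session_to_json(original)
restored = session_from_json(wire)
```

Every module that sends or receives a session has to agree on these field names and types, which is one reason to settle the data model before anything else.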

Those are just some of the reasons why I believe that detailed data description and modeling should always be the first step in software architecture.

Creating the data model

Managing large quantities of structured and unstructured data is a primary function of information systems. Data models describe structured data for storage in data management systems such as relational databases. They typically do not describe unstructured data, such as word processing documents, email messages, pictures, digital audio, and video. Even though the conference organization solution will be based on SharePoint and, as such, can and will contain unstructured data, that data is considered supporting data and will not be covered here. Another aim of this approach is to keep our data model independent of the actual data storage implementation: we could, hypothetically, change the solution architecture to use some other technology for data storage without changing the data model itself.

There are three different levels of data model that we can create:

  • Conceptual schema describes the high-level data architecture: the main entities and the relations between them. It is very often used by business users and analysts to describe the data model.
  • Logical schema contains a detailed data description, including all the entities and columns (if relational data models are used), a detailed specification of the relations between them, and all the other attributes necessary to describe and define the data used in the solution. It later serves as the basis for modeling business entities in the application.
  • Physical schema describes the actual structures in which the data is stored (like database tables and fields), as well as the physical means by which the data is stored. This is concerned with partitions, CPUs, tablespaces, and the like.

The advantage of this approach is that it allows the three perspectives to be relatively independent of each other. Storage technology can change without affecting either the logical or the conceptual schema. The data structure described in the logical schema can change without necessarily affecting the conceptual schema.
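The independence between the logical and the physical level can be sketched in code as well. In the Python example below, the Speaker entity, the SpeakerStore interface, and both storage classes are hypothetical. An in-memory dictionary and SQLite stand in for whatever storage the real solution uses, such as SharePoint lists or SQL Server. The point is only that the entity definition and the calling code stay the same when the storage behind them changes.

```python
import sqlite3
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


# Logical level: a hypothetical entity, independent of any storage technology.
@dataclass(frozen=True)
class Speaker:
    speaker_id: int
    name: str


class SpeakerStore(Protocol):
    """What the application needs from storage, whatever the technology."""

    def save(self, speaker: Speaker) -> None: ...
    def get(self, speaker_id: int) -> Optional[Speaker]: ...


# Physical level, variant 1: a plain dictionary in memory.
class InMemorySpeakerStore:
    def __init__(self) -> None:
        self._items: Dict[int, Speaker] = {}

    def save(self, speaker: Speaker) -> None:
        self._items[speaker.speaker_id] = speaker

    def get(self, speaker_id: int) -> Optional[Speaker]:
        return self._items.get(speaker_id)


# Physical level, variant 2: a relational table in SQLite.
class SqliteSpeakerStore:
    def __init__(self, connection: sqlite3.Connection) -> None:
        self._db = connection
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS speaker ("
            "speaker_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def save(self, speaker: Speaker) -> None:
        self._db.execute(
            "INSERT OR REPLACE INTO speaker (speaker_id, name) VALUES (?, ?)",
            (speaker.speaker_id, speaker.name),
        )
        self._db.commit()

    def get(self, speaker_id: int) -> Optional[Speaker]:
        row = self._db.execute(
            "SELECT speaker_id, name FROM speaker WHERE speaker_id = ?",
            (speaker_id,),
        ).fetchone()
        return Speaker(*row) if row else None


def register(store: SpeakerStore, speaker: Speaker) -> Optional[Speaker]:
    """Application code: written against the logical model only."""
    store.save(speaker)
    return store.get(speaker.speaker_id)


ana = Speaker(1, "Ana Example")
from_memory = register(InMemorySpeakerStore(), ana)
from_sqlite = register(SqliteSpeakerStore(sqlite3.connect(":memory:")), ana)
```

Both calls to register return an equal Speaker, even though one variant never touches a database.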

How to proceed

Creating the data model does not necessarily mean that we will immediately open SharePoint Designer to create libraries and lists, or SQL Server Management Studio (if SQL Server is used for data storage) to create tables and fields. There is a tendency in SharePoint solution development to "kick off immediately" and start creating lists and libraries, so that users can immediately see the progress and upload some files. This is a good approach when you are creating a small, quick solution that supports collaboration inside one team. But a complex enterprise solution based on SharePoint needs architecture as much as a solution on any other platform does. A quick kick-off can become a real obstacle and cause a lot of trouble later on.

Basically, with SharePoint, as with any technology, we need to create a data model for our solution.

The best way to proceed is to first sketch and describe the data model on paper. You can actually use pen and paper (this is the easiest way to do it), or Microsoft Visio and Microsoft Word, to create the conceptual and logical schema of the data structures that will be used.

This schema will, as we have said, be the cornerstone of our solution, and at regular intervals we will check whether all of the other architecture aspects are still in the spirit of that schema.

In the next post of this series, scheduled for 05.10.2011, a full conceptual and logical schema for the Conference Organization Solution will be created and described.

Continue to:
Architecting Sharedove Conference Organization Solution: Data First (Part 2)
Architecting Sharedove Conference Organization Solution: Data First (Part 3)