A complete open source IFC development stack

Getting started with programming for IFC has never been easier or cheaper.

I present here a stack of great tools that together let you build software to read, write, analyse, automate, store in a database, and do whatever you want with IFC data. The sky really is the limit. They are all:

  • Open source
  • Zero cost
  • Usable in commercial and business environments
  • Cross platform
  • Installable and runnable without elevated admin rights

The tools are:

  • Version control: Git
  • Code editor: Visual Studio Code
  • Framework: .NET Core 2.2
  • Language: C#
  • IFC writing / reading toolkit: xBIM
  • Bundler: dotnet-warp

(Note: this is not a detailed, from-scratch guide; some additional setup and configuration may be required depending on your exact system.)

I highly recommend against creating your own IFC reader and writer unless you really need to. It is not a task for the faint-hearted, and you'll need to be absolutely sure that none of the existing solutions fit your requirements.

Git

Git provides the source code version control system. It tracks all your changes and eventually allows you to upload to private and public code repositories, such as GitHub.

You can download and install the latest version from here: https://git-scm.com/

Visual Studio Code

Your Integrated Development Environment is really important: this is where you'll spend the majority of your time coding and debugging the tools you write.

The Professional and Enterprise versions of the traditional Visual Studio IDE are expensive, heavyweight tools, mainly aimed at full-time software developers.

A faster and cheaper approach is to use Visual Studio Code. This is a really popular, lightweight code editor with IDE features that you can install here: https://code.visualstudio.com/

If you use C# with Visual Studio Code, you should also install the C# tools extension from within Visual Studio Code.

.NET Core 2.2

This is Microsoft’s completely open source and free application framework. It contains all the tools needed to develop cross-platform libraries, console apps, cloud apps, and more.

It doesn’t include desktop UI frameworks like WinForms or WPF, but if you require a UI my suggested route is to build a command-line app for the IFC logic and wrap it with a UI built in another technology, such as Electron.

At the time of writing, the latest xBIM library only supports up to .NET Core 2.2, so don’t install a version any higher than that. By the time you read this, check for yourself what the highest version xBIM supports.

To install without needing elevated admin rights, use the PowerShell script here: https://dotnet.microsoft.com/download/dotnet-core/scripts

C#

C# is an open source, open-specification language from Microsoft. It is hugely popular, with millions of users and a vast ecosystem.

It has a steeper learning curve than scripting languages like Python and JavaScript, but it is well worth the effort.

It installs with .NET Core listed above, and there are good beginner tutorials here: https://docs.microsoft.com/en-us/dotnet/csharp/tutorials/intro-to-csharp/

xBIM

This is a software library that takes away the difficult parts of reading and writing IFC files and allows you to focus on the business logic you need to implement.

It’s developed by Northumbria University and has contributions from many other people around the world.

See the GitHub page for installation instructions: https://github.com/xBimTeam
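
To give a feel for the library, here is a minimal sketch of reading an IFC file with xBIM. It assumes the Xbim.Essentials NuGet package is installed; exact namespaces and method names may differ slightly between versions, so treat it as illustrative rather than definitive:

    using System;
    using Xbim.Ifc;                // IfcStore lives here
    using Xbim.Ifc4.Interfaces;    // schema-agnostic interfaces such as IIfcWall

    class ListWalls
    {
        static void Main(string[] args)
        {
            // Open an existing IFC file; the path is passed as the first argument.
            using (var model = IfcStore.Open(args[0]))
            {
                // Print the GlobalId and name of every wall in the model.
                foreach (var wall in model.Instances.OfType<IIfcWall>())
                {
                    Console.WriteLine($"{wall.GlobalId} : {wall.Name}");
                }
            }
        }
    }

Writing and editing follow a similar pattern (wrapped in transactions); the xBIM documentation has fuller examples.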

dotnet-warp

By default, a .NET Core 2.2 app builds to an .exe plus a large number of DLLs and linked libraries. This makes deployment of general desktop apps quite messy.

To get around this, you can use the dotnet-warp tool, which bundles everything into a single .exe file for easy deployment and management.

NuGet installation: https://www.nuget.org/packages/dotnet-warp/

Alternatives

If C# and .NET are a bit too heavy, then you can swap them for Python and swap xBIM for IfcOpenShell.

If they aren’t heavy enough, IfcOpenShell can also be used from C++.

Summary

I hope this article is useful for people looking to do development and automation with IFC.

And if you’re already developing with IFC, please comment below to let me know what your stack looks like! What language, IDE, and IFC toolkits do you use?

Stop sharing native models

Everyone wants collaboration and sharing of information, and one of the first things people ask for on projects is for others to share their native models, whether that be Revit, Tekla, ArchiCAD, or something else.

I myself have been guilty of this.

It seems like a good idea at the time: you get the exact data that the other people are working with, you can often merge or link models a little more easily, and you can make changes to their models if needed.

Dependency Hell

However, when you look at the big picture, sharing and using native models has one very big downside: Dependency Hell! I highly recommend you read the linked Wikipedia page, and I’m sure you’ll relate if you’ve ever worked on a project where everyone gets stuck on the same version of Revit.

Dependency Hell is enough of a problem that you should seriously consider whether sharing native models actually solves real problems and whether it’s worth it when you take into account all the downsides.

Connected, not coupled

We want people on a project to work together, to share information, and to be connected, but we don’t want them to be dependent on each other.

The tools that any particular stakeholder uses should have little to no effect on what anyone else uses. The tools shouldn’t matter to anyone except the people using them; it’s the output that is important. Unfortunately, by sharing native models and specifying software we make the tools matter.

By sharing native models we couple and tightly bind people together in a way that restricts the choices they can make. One party can’t use a new version or tool unless everyone changes too, which is often a very difficult and expensive process.

Use the best tool for the job

The overriding priority for any IT systems or software setup should be to allow people to use the best tool for the job they need to do.

This not only improves efficiency, but often makes for better employee well-being. Nothing sucks more than knowing there’s a better tool out there but not being able to use it.

By mandating a certain tool or compatibility with a proprietary format, people have to use what is specified, not what might be best. This is not good.

Increase competition

If you specify a particular piece of software you’re cutting out businesses that don’t use it and reducing your options.

As a double whammy, you’re also encouraging all businesses to use the same tools, potentially creating monopolies that inevitably stagnate and stop innovating.

Focus on IFC

Some projects specify many different types of formats, IFCs, native, and even 3D PDFs. Apart from the information coordination nightmare this creates, it means people don’t have a focus for their workflows.

Some people will create workflows for PDF, some for native, and some for IFC. This represents duplicated effort. If you only specify a free, platform-agnostic format like IFC, then everyone focuses on that format and the chances of people being able to share knowledge and workflows increase significantly. This is an important way to add value to a project.

Summary

In summary, construction project teams should absolutely work together, but they need to be more aware of the big-picture problems that come from aligning so tightly that they are no longer free to make decisions that would ultimately be in a project’s best interest.

Getting Valid Input Data

As the construction industry becomes more data driven, it’s often the little things that break data integration processes. For example, say you want to set up a process that links data from a model to a construction sequence: the only way you could do that automatically would be to have some form of common value, so one system knows which things to match from the other.

However, in both the modelling and sequencing software these fields will often be free manual text input. The nature of these two separate disciplines often means that any discrepancies won’t be found until days or weeks after the incorrect values have been entered. And the longer that goes on, the more difficult it is to correct.

The nature of construction and design means that there are a lot of humans involved in designing, engineering, and creating things which need to be named and described in ways that will never be 100% predictable.

This creates painful little errors, such as “0” (zero) being typed as “O” (the letter), or spacing and separation characters being “/” or “-” or “_”. These are difficult to spot and can completely break even the most well put together digital systems.
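
To make that fragility concrete, here is a small, self-contained C# sketch (all the codes and descriptions are made up) showing how a single letter-O/zero mix-up silently breaks an automated match between model elements and sequence tasks:

    using System;
    using System.Collections.Generic;

    class MatchDemo
    {
        static void Main()
        {
            // Hypothetical shared key: an element code used by both systems.
            var modelElements = new Dictionary<string, string>
            {
                ["WALL-001"] = "External wall, level 1",
                ["WALL-OO2"] = "External wall, level 2",   // typo: letter O instead of zero
            };
            var sequenceTasks = new[] { "WALL-001", "WALL-002" };

            foreach (var code in sequenceTasks)
            {
                if (modelElements.TryGetValue(code, out var description))
                    Console.WriteLine($"{code} -> {description}");
                else
                    Console.WriteLine($"{code} -> NO MATCH (check for typos or naming differences)");
            }
        }
    }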

Approaches

There are a number of ways to increase the likelihood of correctly entered and aligned data. In order of effectiveness:

  1. Restricting input to predefined values
  2. Not allowing input if the input does not pass validation rules
  3. Displaying warnings if input does not pass validation rules
  4. Performing automated validation
  5. Performing manual validation

The key to all of these is informing the person that is inputting data that there is a problem as quickly as possible.
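
As a sketch of what approaches 2 and 3 might look like inside a bespoke tool, the snippet below checks a level name against a hypothetical project convention (“L” followed by two digits) and specifically flags the letter-O/zero mix-up. The convention and messages are made up for illustration:

    using System.Text.RegularExpressions;

    static class LevelNameValidator
    {
        // Hypothetical project convention: "L" followed by two digits, e.g. "L01".
        static readonly Regex Convention = new Regex("^L[0-9]{2}$");

        // Returns a warning message (approach 3), or null if the value is valid.
        // A stricter tool could simply refuse the input instead (approach 2).
        public static string Validate(string input)
        {
            if (Convention.IsMatch(input))
                return null;

            // Catch the classic letter-O / digit-0 mix-up before rejecting outright.
            if (Convention.IsMatch(input.Replace('O', '0')))
                return $"'{input}' looks like a level name but uses the letter 'O' instead of zero.";

            return $"'{input}' does not match the level naming convention (expected e.g. 'L01').";
        }
    }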

The human factor

Think about whenever you’ve had to fill in a form online. We much prefer it when a field instantly highlights that we’ve done something wrong, as opposed to filling everything in, clicking submit, and only then being told we’ve typed something wrong or that a username is already taken.

I think this boils down to humans simply not liking going back to things they believe they’ve already finished. We like to complete a certain task and then move onto something new. To counteract this, we need to inform people of problems as fast as possible.

Implementation

In reality, the trouble with approaches 1, 2, and 3 is that they rely on individual tools allowing this level of input control, which is difficult to achieve. Most input fields in BIM tools are simply free entry, and creating your own validation system for every tool is not feasible.

Of course, if you are in a position to create your own apps or do bespoke development, then absolutely you should try 1, 2, and 3.

Number 5, manual validation, is often performed with tools such as Solibri or Excel. The major problem with these tools is the manual and time-consuming nature of performing the validation steps. It’s often done by specialists, and by the time a model arrives for validation, has been validated, and has had a report created for it, the original modeller or designer may have moved on to different areas and will be reluctant to go back and fix little data errors.

This is where number 4 comes in. The most scalable solution for getting better data is automated validation. As soon as a model is uploaded to a CDE there should be systems in place to download and validate this model, reporting back in near real time. The person responsible for that data should be informed as fast as possible.

This process can generate data quality dashboards, which are essential to shine light on how well aligned your systems and data are.

Blockers

The biggest blocker I see at the moment is the lack of an open, standard, and accepted data validation language for construction. Model View Definitions (MVDs) go part of the way there, but aren’t granular enough to get the sort of validation we require.

We need both semantic structure validation rules (e.g. All doors should be in a wall) and allowable values rules (e.g. Levels named to the project standard). As another level, even automated design validation would be useful (e.g. No pipes more than 12m in length).
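
As a rough illustration of what an automated allowable-values rule could look like in code, here is a hedged sketch using the xBIM library mentioned earlier; the naming pattern is hypothetical, and package and API details may vary between xBIM versions:

    using System;
    using System.Text.RegularExpressions;
    using Xbim.Ifc;
    using Xbim.Ifc4.Interfaces;

    class LevelNameCheck
    {
        static void Main(string[] args)
        {
            // Hypothetical allowable-values rule: storey names must be "L" plus two digits.
            var rule = new Regex("^L[0-9]{2}$");

            using (var model = IfcStore.Open(args[0]))
            {
                foreach (var storey in model.Instances.OfType<IIfcBuildingStorey>())
                {
                    var name = storey.Name?.ToString() ?? "";
                    if (!rule.IsMatch(name))
                        Console.WriteLine($"FAIL {storey.GlobalId}: storey name '{name}' does not match the project standard");
                }
            }
        }
    }

A small console app like this could be triggered automatically whenever a model lands in the CDE, with the results feeding the near-real-time reports and dashboards described above.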

Proprietary systems like Solibri Model Checker aim to fill this gap, but cannot be a long term solution.

What the solution to this problem would look like exactly, I don’t know, but I believe the problem of inconsistent and bad data is big enough to warrant thinking about.

Why aren’t Gantt charts dead yet?

Ubiquitous in construction for generations, yet no Gantt chart has ever been correct. Ever. So why do we still use them?

World War Gantt

The actual inventor of what we know as Gantt charts is difficult to determine because they have developed over time, but Henry Gantt (1861-1919) was the first to bring them into popularity, and hence they are named after him.

They were used extensively by the US in the First World War, which is hardly an endeavour famous for its organisational and management efficiency.

Despite this, they are seen as the default tool for planning construction activities. Their use is never questioned, but are we really going to say that in over 100 years the Gantt chart is still the best system of planning that we have?

Now, that is a scary thought…

Future fantasy

At best a Gantt chart is a work of fiction, in so far as it is about imaginary events that have not happened (yet). At worst it is a lie, because whoever is making the chart is well aware that events will not unfold exactly as they have described.

They are Minority Report-esque in their attempt to give precise dates and durations to tasks that are weeks, months, and even years in the future.

If we were to call them “Predictions of the Future” I imagine many would have less faith in them.

The future is a fog where the further ahead you look, the harder it is to know what is there, but Gantt charts have no mechanism to show this. They make it seem like it is possible to plan a task that is due in a year’s time with as much accuracy as one that is due tomorrow.

Despite the fact no one can predict the future, a Gantt chart gives the illusion of predictability, control, and certainty. Perhaps the comfort that this illusion provides is what has sustained the usage of Gantt charts throughout the decades.

When the future is wrong

And what happens when leadership finds out that (surprise surprise) the future doesn’t turn out the way it was predicted and (dun dun duh!) the project is “behind schedule” (gasps and groans in equal measure)?

The project programme becomes a stick, a big deadly whacking stick.

A death march to the deadline begins. No celebrations are allowed. Hours get increased, costs spiral, the effect on everyone’s mental health is horrible. People get burned out and leave.

And this is just the accepted norm in construction.

In my opinion, Gantt Charts have a major role to play in this negativity.

Unpredictable

The construction industry hates unpredictability but our solution so far has been to create ever more detailed plans in an effort to get a hold of the greased pig that is building stuff.

Even with modern methods of construction and off site manufacturing, any construction project of significant size is inherently chaotic and unpredictable. But we refuse to acknowledge this and believe we can bend reality to fit our planning methodology.

I’ve even seen such daft uses for Gantt charts as software implementation and digital transformations, as if either of those things actually has fixed dates, fixed durations, and fixed scope.

So, what’s an alternative?

The first step is to just get people thinking that perhaps we can do better than basing our entire system of planning around a century old 2D chart format.

Next is to admit we can’t predict the future. The best we can do is to estimate, and the further into the future things are the less accurate we will be.

Then we can start to look at other ways of managing dates and dependencies that are agile enough to prevent change from destroying projects and people. They do exist, in the form of the Agile and Scrum methodologies that have matured in the software development world; we just need to open our minds to them.

Finally, stop focusing on the deadline and instead focus on productivity. Work on improving teams’ efficiency by making the work transparent, holding weekly lessons-learned sessions rather than one every few months which gets forgotten about, iterating on your processes frequently, and gaining feedback as soon as you can.

By improving the team you are much more likely to hit the deadlines in the first place.

 

Business benefits of openBIM

Communicating the business benefits of openBIM is difficult.

We’re up against armies of marketers from large corporations with multi-million pound budgets to lobby and appeal to the major industry players, as well as to spread Fear, Uncertainty, and Doubt about alternatives that could hurt their bottom line.

Unfortunately, believing and following the big monopolistic companies is often the default safe bet for IT and digital departments; after all, “Nobody gets fired for buying IBM”.

This post is my attempt to summarise the business benefits of adopting openBIM as a construction business’s underlying digital strategy. The benefits are split into the following three areas.

  1. Flexibility, choosing from many options
  2. Agility, being able to change
  3. Stability, keeping consistency

Flexibility

Sticking to solutions from just one company gives you a small box to work within: there are generally limited options, and you will end up changing your process to fit the tool rather than the other way around.

Adopting openBIM doesn’t mean you can’t continue using your existing favourite software, but it does increase your available options, and it allows you to always choose the best tool for the job.

If software suppliers know that you have no flexibility and little to no choice of alternatives, it gives them tremendous power to dictate cost and quality to you.

By adopting openBIM formats you are showing them that you have a wide choice of suppliers and they will have to work hard to get your money, driving competition and innovation.

Agility

Once you decide to use a single company’s platform or tool you are effectively locked in.

You can do all the due diligence in the world and be 100% certain they have the best tools for the job at the time of deciding, but what happens if, in 12 months’ time, things aren’t as good as they first appeared and competitors with better solutions start to enter the market? Bad luck: you’re stuck with the initial decision and their proprietary solutions you can’t easily escape from. The cost of change would be very significant.

(My favourite example of this is the cutting-edge UK Navy aircraft carrier powered by Windows XP. That project clearly had no room for agility and to change as new developments occurred.)

However, if you adopt openBIM as your underlying strategy the cost of change is vastly decreased. All your data will be highly compatible with IFC so you will be able to easily open your existing data in new systems with minimal migration costs.

openBIM also reduces dependencies. You can upgrade your software to take advantage of new features without having to wait for others to also upgrade theirs (if they ever do).

The costs will never be zero, because even little changes cost businesses money, but the cost will be reduced, and the ability to react to technological advances can be a key competitive advantage.

Stability

In this age of rapid technological change, it would be a fool who tried to predict or plan how we’re going to be working in just a couple of years’ time.

So how does a company create a medium or long term digital strategy needed for investment decisions?

The answer is to not standardise around something as temporary and ephemeral as a tool or a platform, but to standardise around data structures. And openBIM provides the best data structures to do this with.

To be clear, we want tools and platforms to rapidly evolve and to help improve our efficiency, but the underlying data should be far more stable.

If the fundamental underlying data structures keep changing, say by changing file formats every year, then that can cause huge legacy and compatibility difficulties. For example, if you build a tool around a proprietary piece of software, it can become a sunk cost which works against change. If your tools are based on underlying standard data, they’ll have a greater lifespan and return on investment.

openBIM provides stable and comprehensive data structures you can base your construction data on.

With openBIM you can chop and change and overhaul your tools and processes frequently but maintain a stable foundation of a standard data structure.

Summary

If this blog post is too long for an elevator pitch to the CEO, here’s my attempt to boil the business benefits of openBIM down to a single snappy sentence:

Adopting openBIM gives a business the flexibility to choose the best tool for the job, the agility to change tools, and the stability to make long term decisions.

Streamlining IFC

With each new IFC version there has been much celebration of how many new entities have been added and how large and comprehensive IFC has become (an astonishing 776 classes and counting).

But if it isn’t already, there is a serious danger of IFC becoming bloated.

Rather than extending, I think now is the right time to be talking about streamlining and modernising the fundamentals of the standard to improve the ease of implementation and accessibility into the future.

Here are some ideas based on practical usage experience.

Remove all property sets

Currently IFC consists of a schema (entities and attributes) plus property sets. I believe IFC would be much stronger if it was schema only.

While a schema can be universally adopted across every project in the world, the properties can change completely depending on client requirements and project specific processes.

I would remove all predefined property sets (like Pset_SanitaryTerminalTypeCommon) because compared to the schema, they add relatively little value.

This would be a good first step to slimming down and focusing on the IFC schema.

Properties are very important, but should be handled by an entirely separate standard, such as the buildingSMART Data Dictionary.

Remove the ‘Ifc’ prefix from all entities

A simple one, but it is incredible how much more readable and clear the standard becomes once the ‘Ifc’ prefix is removed. A wall should be a Wall, not an IfcWall.

Clean up the inheritance tree

(The following are based on IFC4add2, but the same applies to most versions)

Microsoft advises that having classes with more than 4 levels of inheritance is bad design and “can be difficult to follow, understand, and maintain”.

The IFC specification has 353 classes that are 5 or more levels deep; of those, an incredible 117 are 9 levels deep!

Having that many layers of abstraction is incredibly excessive. Just because some entities share attributes doesn’t mean a parent class should be created at the expense of readability.

As a start, Relationships can be removed from the main Rooted tree. A relationship does not really need a name, description, or owner history. It is debatable whether they even need a GlobalId.

I also don’t think anyone could convincingly argue that nearly 50 different classes are required to meaningfully represent relationships and remain easily implementable.

Remove IfcPropertySets from the rooted tree too, and then IfcObject and IfcTypeObject can be the new top-level entities. This reduces the maximum inheritance depth to 7, which is still extremely high!

A single IfcPropertySet class

The property set concept of a list of name-value pairs is very simple and highly universal. There’s no need for all the predefined sets, including quantity sets, and templates. I’m certain a single IfcPropertySet class would be sufficient for 99% of use cases and also does not need to inherit from Root.
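
To show how little is actually needed, here is a rough sketch (in C#, with names entirely my own and not taken from any published IFC schema) of a single name-value property set that isn’t rooted and carries no GlobalId, owner history, or description:

    using System.Collections.Generic;

    // Illustrative only: roughly what a single, non-rooted property set could look like.
    public class PropertySet
    {
        public string Name { get; set; }
        public IDictionary<string, object> Properties { get; } = new Dictionary<string, object>();
    }

Usage would be as simple as creating a set, giving it a name, and adding whatever name-value pairs the project requires.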

Remove Base64 GlobalIds

It is for purely historical storage reasons that IFC uses GUIDs converted to Base64 with a unique character-mapping scheme. The amount of space that this saves in a modern IT environment is negligible.

GlobalIds should now be in the standard Base16 format so that more existing software libraries can read, process, and create them, rather than requiring a custom algorithm.
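
For comparison, a standard Base16 GUID can be produced and parsed by practically every programming language out of the box, as in this small C# sketch, whereas the 22-character IFC GlobalId needs a bespoke encoder and decoder for buildingSMART’s character mapping:

    using System;

    class GuidDemo
    {
        static void Main()
        {
            var guid = Guid.NewGuid();

            // Standard Base16 (hexadecimal) representations, supported everywhere:
            Console.WriteLine(guid.ToString("D")); // e.g. 3f2504e0-4f89-41d3-9a0c-0305e82c3301
            Console.WriteLine(guid.ToString("N")); // e.g. 3f2504e04f8941d39a0c0305e82c3301

            // IFC's 22-character Base64-style GlobalId, by contrast, requires a
            // custom algorithm implementing buildingSMART's character mapping.
        }
    }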

Scrap the ISO 10303 support

Currently there are both XSD and ISO 10303 EXPRESS schema definitions. Rather than continue with such an outdated and poorly supported format, the EXPRESS schema should be scrapped in favour of the XSD schema definition, which is far easier to develop against. I’m aware that there are some aspects that can’t be specified in XSD, but if thousands of other formats can be specified using just XSD, then I don’t believe IFC is so special as to be an exception.

Because relatively few tools support the STEP file format, IFC in zipped XML format should be the only format recognised by buildingSMART.

A zipped XML file is the standard approach for all OpenOffice and Microsoft Office file formats and should be more than sufficient.

Summary

The aim of this blog post is to encourage people to consider streamlining the IFC standard to make it easier for everyone to use, understand, and implement.

These are just a few ideas to get things rolling, and I’ve not even mentioned issues like the complexity of interpreting IFC geometry and possible improvements to how it’s documented.

If you have any ideas, please feel free to add them below.

The impossible problem of naming things

Is there any situation more socially awkward than not knowing someone’s name and having to describe them to someone else?

“Erm, you know that tall guy? Kinda slim? Talks fast? Short brown hair? In the engineering department?”

Even deciding what attributes to use to describe them is fraught with difficulties.

And in Building Information Modelling is there anything more contentious than how to name something?

No system ever seems to satisfy everyone’s requirements. The British Standard, Publicly Available Standard, and International Standard documents do a good job but they’re far from perfect. Whether it’s zone, document type, or classification system, there’s always some part of it that just doesn’t seem to work properly for users.

Files, library objects, layers, instances of objects, and even classification systems themselves suffer from this problem.

Why?

Why is it so difficult to name things?

Many decry “dinosaurs” who don’t want to change, or the standards bodies that don’t know about the real world, or simply poor implementation.

But I believe it is because of the unresolvable conflicts between human needs and information management needs.

Humans need names to be readable, simple, understandable, and easy to remember. For information management we need names to be easy to process, flexible, yet also rigidly structured.

Human needs

There’s a good reason why giving people names is a concept used in every human society: it allows us to refer to people easily.

Sure, a person’s name doesn’t really tell us anything about that person, but given context at least we can work out who is being talked about.

Construction projects have many documents and we need a way to reference specific ones, so naturally we use “names”. However, we can’t just give them names like Steve or Roger, instead we try to inject some context and logic into them with names like AH-CHT-Z1-00-DR-M-55_11_20-1001.

Assuming we have memorised the standard, we can have a good guess about what this document contains, but it still leaves an awful lot to the imagination and is not very readable.

It is also a huge assumption that everyone would know what all the codes stand for, so it’s not very simple.

It’s not particularly memorable either.

Information needs

The concatenation problem

To create a contextual name we have to concatenate different information, and we have to be selective about which information to include, which means potentially useful information is excluded.

We also have to shorten descriptions down to simple codes, e.g. DR = Drawing, M2 = Model, ZZ / XX = ???.

And then to actually process that name back into useful data it must be split up and inferred based on the position and short codes. It’s all very difficult to process.

The multiple problem

Then there’s the multiple problem: what happens when a document covers more than one zone, level, type, role, or classification? Ideally there should be a list of possible values, but we can’t put that in a concatenated name, certainly not without making it overly long. The workaround is to just class them as “multiple” or “none applicable” without specifying what the multiples even are. A name isn’t flexible enough to allow for these possibilities.

The string problem

There’s also the problem that a name is always just a string of characters. Enforcing structure on a string of characters is very difficult to do. See the code needed just to validate email addresses, which doesn’t even take into account the content itself. In this respect, names are too flexible and not rigidly structured enough.

Are names necessary?

With all these problems, it brings us to an important point. In a digital, structured data environment do we even need names or naming conventions?

Rather than a name, uniqueness could be ensured using proper Globally Unique IDs.

All the current fields in a name could instead be a list of properties that, without the restriction of having to form a name, could eliminate the need for short codes and use much more readable full words.

For humans, we could have unlimited length descriptions, and even additional fields in any format appropriate.

This could avoid the concept of concatenating fields, which is just a compromise between humans and information and makes data dumber.
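
As a rough sketch of that idea, a document’s metadata could be held as a unique ID plus structured, readable fields rather than a concatenated string; the field names below are purely illustrative:

    using System;
    using System.Collections.Generic;

    // Illustrative only: the information a name like "AH-CHT-Z1-00-DR-M-55_11_20-1001"
    // tries to compress, held instead as a unique ID plus readable properties.
    public class DocumentRecord
    {
        public Guid Id { get; } = Guid.NewGuid();                 // uniqueness without a naming convention
        public string Description { get; set; }                   // unlimited length, human readable
        public string DocumentType { get; set; }                  // "Drawing" rather than "DR"
        public List<string> Zones { get; } = new List<string>();  // handles the "multiple" problem
        public List<string> Levels { get; } = new List<string>();
        public string Classification { get; set; }
    }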

Conclusion

Let’s forget about names. A name should be anything anyone wants it to be. If we need more context then look at the attached raw data. Because after all, the data is where the focus should be.