Could ImGUI be the future of GUIs?

2019-01-27

A random, not-thought-out idea that came up: could something like Dear ImGui ever be the future of mainstream UI libraries?

For those who don't know what an Immediate Mode GUI (ImGUI) is, there's a semi-famous video by Casey Muratori about it from around 2005.

Most programmers who use an ImGUI style find it far easier to make UIs with than with traditional retained mode GUIs. They also find ImGUIs significantly more performant.

The typical retained mode object oriented GUI framework is a system where you basically create a scenegraph of GUI framework widgets (windows, grids, sliders, buttons, checkboxes, etc.). You copy your data into those widgets. You then wait for events or callbacks to get told when a widget was edited. You then query the widget's values and copy them back into your data.
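The round trip described above can be sketched with a toy widget. This is only an illustration, not any real framework's API: `CheckboxWidget`, `addChangeListener`, and `simulateClick` are hypothetical names, and `simulateClick` stands in for the framework reacting to real user input.

```javascript
// Hypothetical retained mode widget (not a real framework's API).
// It keeps its own copy of the value it displays.
class CheckboxWidget {
  constructor(label, checked) {
    this.label = label;
    this.checked = checked;   // the widget's private copy of the data
    this.listeners = [];
  }

  addChangeListener(fn) {
    this.listeners.push(fn);
  }

  // Stand-in for the framework handling a real mouse click.
  simulateClick() {
    this.checked = !this.checked;
    for (const fn of this.listeners) {
      fn(this);
    }
  }
}

const settings = { muted: false };   // the app's real data

// 1. Create the widget and copy the data into it.
const muteBox = new CheckboxWidget("Mute", settings.muted);

// 2. Get told about the edit, 3. query the widget and copy the value back.
muteBox.addChangeListener((widget) => {
  settings.muted = widget.checked;
});

muteBox.simulateClick();
console.log(settings.muted);   // true

// The copy only flows one way: changing the data does not update the widget.
settings.muted = false;
console.log(muteBox.checked);  // still true
```

The last two lines show the cost of keeping two copies: once the app changes its own data, the widget is stale until something explicitly updates it.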

This pattern is used in practically every GUI system out there. Windows, WPF, HTML DOM, Apple UIKit, Qt, you name it. 99% of GUI frameworks are Retained Mode Object Oriented Scenegraph GUIs.

A few problems with this GUI style are

In contrast in an ImGUI there are no objects and there is almost no state. The simple explanation of most ImGUIs is that you call functions like

// draw a button
if (ImGUI::Button("Click Me")) {
  IWasClickedSoDoSomething();
}
// draw slider
ImGUI::SliderFloat("Speed:", &someInstance.speed, 0.0f, 100.0f);

Button and Slider do two things.

  1. They append into a vector (array) the positions and texture coordinates needed to draw the widget (or not insert them if they'd be clipped off screen or outside the current window / clip rectangle)

  2. They check the position of the mouse pointer, state of keyboard, etc to manipulate that widget. If the data changed they return it immediately
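Those two jobs can be sketched in a few lines. This is a toy, not how Dear ImGui is actually implemented: `ui`, `beginFrame`, `overlaps`, `contains`, and `sliderFloat` are hypothetical names, the layout is hard-coded, and the draw list holds simple command objects instead of real positions and texture coordinates.

```javascript
// Toy immediate mode context. Every name here is hypothetical.
const ui = {
  drawList: [],                          // draw commands, rebuilt every frame
  clip: { x: 0, y: 0, w: 640, h: 480 },  // current clip rectangle
  mouse: { x: 0, y: 0, down: false },    // input snapshot for this frame
  cursorY: 0,                            // where the next widget is placed
};

function beginFrame(mouse) {
  ui.drawList.length = 0;
  ui.cursorY = 0;
  ui.mouse = mouse;
}

function overlaps(a, b) {
  return a.x < b.x + b.w && b.x < a.x + a.w &&
         a.y < b.y + b.h && b.y < a.y + a.h;
}

function contains(r, p) {
  return p.x >= r.x && p.x < r.x + r.w &&
         p.y >= r.y && p.y < r.y + r.h;
}

// Draws a slider and returns its (possibly changed) value.
function sliderFloat(caption, value, min, max) {
  const rect = { x: 10, y: ui.cursorY, w: 200, h: 20 };
  ui.cursorY += 30;

  // Fully clipped widgets add nothing to the draw list.
  if (!overlaps(rect, ui.clip)) {
    return value;
  }

  // Job 2: look at the input state and compute the new value.
  let result = value;
  if (ui.mouse.down && contains(rect, ui.mouse)) {
    const t = (ui.mouse.x - rect.x) / rect.w;
    result = min + t * (max - min);
  }

  // Job 1: append what is needed to draw the widget.
  const knobX = rect.x + ((result - min) / (max - min)) * rect.w;
  ui.drawList.push({ kind: "rect", ...rect });
  ui.drawList.push({ kind: "knob", x: knobX, y: rect.y });
  ui.drawList.push({ kind: "text", text: caption, x: rect.x + rect.w + 8, y: rect.y });
  return result;
}

const someInstance = { speed: 25 };

// Frame 1: the mouse is up, so the value passes through unchanged.
beginFrame({ x: 0, y: 0, down: false });
someInstance.speed = sliderFloat("Speed:", someInstance.speed, 0, 100);

// Frame 2: the mouse is held down halfway along the slider.
beginFrame({ x: 110, y: 10, down: true });
someInstance.speed = sliderFloat("Speed:", someInstance.speed, 0, 100);

console.log(someInstance.speed);   // 50
console.log(ui.drawList.length);   // 3
```

Nothing about the slider survives between frames except the value the caller chose to keep.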

So, pluses:

Possible minus

Perceived but probably not minuses

I guess I'm really curious. I know most GUI framework authors are skeptical that ImGUIs are a good pattern. AFAICT though, no one has really tried. As mentioned above, most ImGUIs are used for game development. It would take a concerted effort to find the right patterns to completely replicate something as fancy as, say, Apple's UIKit. Could it be done and stay performant? Would it lose the performance by adding back in all the features? Does the basic design of an ImGUI mean it would end up keeping the perf and the ease of use? Would we find certain features are just impossible to really implement without a scenegraph?

Let me also add that to some degree React is similar to an ImGUI in usage. React has JSX but it's just a shorthand for function calls. The biggest differences would be

If we were to translate the code above into some imaginary ImReact it might be something like

const Button = (props) => {
  return ImGUI.Button(props.caption);
};

const SliderFloat = (props) => {
  return ImGUI.SliderFloat(props.caption, props.value, props.min, props.max);
};

const Form = (props) => {
  if (<Button caption="Click Me" />) {
    DoSomething();
  }
  // "&props.speed" is pseudo-code borrowed from the C++ version
  <SliderFloat min={0} max={100} value="&props.speed" caption="Speed:" />;
};

Just looking at that React code you can see the translation back into real code is really straightforward.

I'm not exactly sure how the update to speed would work, but then I'm mixing C++ (ImGUI) with JavaScript (React). Typical ImGUIs either let you pass in a pointer to a primitive, something JavaScript doesn't have, or they return the new value as in

newValue = ImGUI::SliderFloat(caption, currentValue, min, max);

which, to match the Dear ImGui C++ example above, you'd write as

someInstance.speed = ImGUI::SliderFloat("Speed:", someInstance.speed, 0.0f, 100.0f);

So if we assumed that style of API then

const Button = (props) => {
  return ImGUI.Button(props.caption);
};

const SliderFloat = (props) => {
  return ImGUI.SliderFloat(props.caption, props.value, props.min, props.max);
};

const Form = (props) => {
  if (<Button caption="Click Me" />) {
    DoSomething();
  }
  props.speed = (<SliderFloat min={0} max={100} value={props.speed} caption="Speed:" />);
};

Notice the components are not returning virtual DOM nodes since there's no need. The only thing we're really taking from React is JSX, just to show that you could use a React style pattern if you wanted to.
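As an aside on the pointer style: JavaScript has no pointer to a primitive, but an object plus a property name can play the same role. Below is a minimal sketch under that assumption. `sliderFloatRef` and its `dragT` parameter are hypothetical; `dragT` stands in for real input handling, and drawing is left out entirely. Returning a boolean "changed" flag mirrors how Dear ImGui's C++ `SliderFloat` reports edits while writing through its pointer argument.

```javascript
// Hypothetical helper: `target[key]` plays the role of `&someInstance.speed`.
// `dragT` is a stand-in for real input handling: null means the user is not
// touching the slider, otherwise it is the knob position from 0 to 1.
// Drawing is omitted, so `caption` is unused in this sketch.
function sliderFloatRef(caption, target, key, min, max, dragT) {
  if (dragT === null) {
    return false;                       // nothing changed
  }
  const newValue = min + dragT * (max - min);
  const changed = newValue !== target[key];
  target[key] = newValue;               // write through the "pointer"
  return changed;
}

const someInstance = { speed: 25 };

console.log(sliderFloatRef("Speed:", someInstance, "speed", 0, 100, null)); // false
console.log(sliderFloatRef("Speed:", someInstance, "speed", 0, 100, 0.75)); // true
console.log(someInstance.speed);                                            // 75
```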

Note: Don't get caught up in the direct state manipulation in the example. How you update state should not be dictated by your UI library. You're free to manage state any way you please regardless of which UI system you use. Still, the example shows how simple the ImGUI style is.

state.value = ImGUI.SliderFloat(caption, state.value, min, max);

is certainly simpler than

// at init time
const slider = new SliderWidget(caption, state.value, min, max);
slider.onChange = function(newValue) {
  state.value = newValue;
};

// if state.value changed, the slider needs to show the new value
function updateSlider(newValue) {
  slider.value = newValue;  // hypothetical widget property
}

Even worse, you now need to somehow call updateSlider everywhere state.value is updated, or you need to write some elaborate system so that all places that want to update state.value call into a system that tracks all the widgets and what state they reflect.
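To make that "elaborate system" concrete, here is a minimal sketch of what it tends to look like. Everything in it is hypothetical: `BoundState`, `bind`, and this stripped-down `SliderWidget` are invented for illustration and are not taken from any real framework.

```javascript
// Hypothetical retained mode widget that keeps its own copy of the value.
class SliderWidget {
  constructor(caption, value) {
    this.caption = caption;
    this.value = value;
  }
}

// Hypothetical store: every write has to go through set() so that
// the widgets bound to a key can be told to update their copies.
class BoundState {
  constructor(initial) {
    this.values = { ...initial };
    this.bindings = new Map();   // key -> array of update callbacks
  }

  get(key) {
    return this.values[key];
  }

  set(key, newValue) {
    this.values[key] = newValue;
    for (const update of this.bindings.get(key) || []) {
      update(newValue);
    }
  }

  bind(key, update) {
    if (!this.bindings.has(key)) {
      this.bindings.set(key, []);
    }
    this.bindings.get(key).push(update);
    update(this.values[key]);    // bring the widget up to date right away
  }
}

const state = new BoundState({ value: 25 });
const slider = new SliderWidget("Speed:", 0);
state.bind("value", (v) => { slider.value = v; });

state.set("value", 60);
console.log(slider.value);       // 60

// Any code that bypasses set() silently leaves the widget stale.
state.values.value = 90;
console.log(slider.value);       // still 60
```

Even this small version adds a second API that all code touching the state has to know about and use correctly.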

ImGUI libraries need no such complication. There is no widget. Every frame, whatever value is in the state is what's in the widget. This is the same promise as React's, but React ends up being hobbled by the fact that it sits on top of slow retained mode GUI libraries.
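A small sketch of that per-frame behavior, with hypothetical names (`sliderFloat`, `renderFrame`) and with input handling left out so only the "state in, picture out" part remains:

```javascript
// Hypothetical immediate mode call: it only records what to draw this frame.
function sliderFloat(drawList, caption, value, min, max) {
  drawList.push({ caption, knob: (value - min) / (max - min) });
  return value;   // input handling is omitted in this sketch
}

// The whole UI is a function of the current state, rebuilt every frame.
function renderFrame(state) {
  const drawList = [];
  state.value = sliderFloat(drawList, "Speed:", state.value, 0, 100);
  return drawList;
}

const state = { value: 25 };
const frame1 = renderFrame(state);

// Some unrelated code changes the state between frames. Nothing is notified.
state.value = 75;
const frame2 = renderFrame(state);

console.log(frame1[0].knob);   // 0.25
console.log(frame2[0].knob);   // 0.75
```

No callback or binding was needed for the second frame to show the new value.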

As an example of the complexity possible, probably the most prolific ImGUI is Unity's Editor UI.

So at least there is some precedent for using an ImGUI in a user-facing app instead of just a game, even if Unity itself is for making games.

There are also lots of screenshots of various ImGUI made UIs in the readme.

Here is also a live version of the example included in the Dear ImGui library

If you decide to interact with it, be aware that it has not actually been designed for the browser and so has issues that need fixing. Those issues can easily be fixed, so don't get bogged down in nitpicking them. Rather, notice how complex the UI is and yet it's running at 60fps. Use the "examples" menu in the main window to open more windows. Expand the examples in the main window and see all kinds of live and complex widgets.

Now imagine you tried to make a UI just as complex using HTML/DOM/React. Not only would the HTML/DOM version likely have lots of pauses and not run at 60fps, but it would probably take 5x to 10x as much code along two dimensions. One dimension is how much code you have to write to implement the UI using HTML/DOM and/or React vs ImGUI. The other is how much code executes to get the UI on the screen. I suspect the number of CPU instructions executed in the HTML/DOM version is up to 100x more than in the ImGUI version.

Consider the ImGUI::Button function vs making a <button> element.

For the <button> element

  1. An HTMLButtonElement object has to be created.

    It has all of these properties that need to be set to something

     autofocus: boolean 
     disabled: boolean 
     form: object 
     formAction: string 
     formEnctype: string 
     formMethod: string 
     formNoValidate: boolean 
     formTarget: string 
     name: string 
     type: string 
     value: string 
     willValidate: boolean 
     validity: object ValidityState
     validationMessage: string 
     labels: object NodeList
     title: string 
     lang: string 
     translate: boolean 
     dir: string 
     dataset: object DOMStringMap
     hidden: boolean 
     tabIndex: number 
     accessKey: string 
     draggable: boolean 
     spellcheck: boolean 
     autocapitalize: string 
     contentEditable: string 
     isContentEditable: boolean 
     inputMode: string 
     offsetParent: object 
     offsetTop: number 
     offsetLeft: number 
     offsetWidth: number 
     offsetHeight: number 
     style: object CSSStyleDeclaration
     namespaceURI: string 
     localName: string 
     tagName: string 
     id: string 
     classList: object DOMTokenList
     attributes: object NamedNodeMap
     scrollTop: number 
     scrollLeft: number 
     scrollWidth: number 
     scrollHeight: number 
     clientTop: number 
     clientLeft: number 
     clientWidth: number 
     clientHeight: number 
     attributeStyleMap: object StylePropertyMap
     previousElementSibling: object 
     nextElementSibling: object 
     children: object HTMLCollection
     firstElementChild: object 
     lastElementChild: object 
     childElementCount: number 
     nodeType: number 
     nodeName: string 
     baseURI: string 
     isConnected: boolean 
     ownerDocument: object HTMLDocument
     parentNode: object 
     parentElement: object 
     childNodes: object NodeList
     firstChild: object 
     lastChild: object 
     previousSibling: object 
     nextSibling: object 
     nodeValue: object 
     textContent: string 
    
  2. More objects need to be created.

    Looking above we can see we need to create

    NodeList            // an empty list of children of this button
    HTMLCollection      // another empty list of children of this button
    StylePropertyMap    //
    NamedNodeMap        // the attributes
    DOMTokenList        // the CSS classes as a list
    CSSStyleDeclaration // an object used to deal with CSS
    DOMStringMap        // empty but used for dataset attributes
    ValidityState       // ?? no idea
    

This is just creation time so far. Tons of properties need to be set to defaults or filled out with empty strings, and/or other objects need to be created. Those objects also need all their properties filled out and may in turn need deeper objects created.

Now that an HTMLButtonElement exists it gets inserted into the DOM.

At render time the browser will walk the DOM. I'm sure there is some amount of caching, but it needs to figure out where the button is. It will likely build some internal scene graph, separate from the DOM itself, which is rendering specific, so 1000s more lines of code get executed.

Eventually it will get to the point of rendering the button. Here again it has to check the 100s of CSS properties. Text color? Font size? Font family? Text shadow? Transform? Animation? Border? Multiple borders? Background color? Background image? Background gradient? Is it transparent? Is it in its own stacking context? Literally 100s of options.

Let's assume it's using nothing special. Eventually it will generate some quad vertices to render font glyphs. It will likely render these glyphs into a texture or grid of textures for the stacking context. It does this as an optimization: ideally, if a different stacking context has its content change but nothing in this stacking context changes, it can skip re-rendering the texture(s) for this context and just use the ones it created last time.

I'm sure there are 100 other steps I'm missing related to caching positions, marking things as computed so they don't get recomputed, and on and on.

Compare that to ImGUI::Button, which is just a function, not an object. All it effectively does is

  1. Clip the button rectangle to the current clip space and exit if it's completely clipped
  2. Insert the vertices for the rectangle of the button into the pre-allocated vertex array
  3. Insert the vertices for each glyph stopping when the first glyph is clipped by the button area.
  4. Return true if the mouse button was pressed and its position is inside the button rectangle, else false.

That's it
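The four steps can be sketched as follows. This is a toy under stated assumptions, not Dear ImGui's real internals: `ctx`, `pushQuad`, and `button` are hypothetical names, the font is assumed to be fixed width, the glyph's character code stands in for a real texture coordinate, and step 1 only does the "exit if completely clipped" test rather than a full clip.

```javascript
// Hypothetical toy context. None of this is Dear ImGui's real internals.
const MAX_FLOATS = 4096;
const ctx = {
  vertices: new Float32Array(MAX_FLOATS), // pre-allocated: x, y, u, v per vertex
  count: 0,                               // floats used so far this frame
  clip: { x: 0, y: 0, w: 100, h: 100 },   // current clip rectangle
  mouse: { x: 0, y: 0, pressed: false },  // input snapshot for this frame
};

const GLYPH_W = 8;    // assume a fixed-width font to keep the sketch short
const GLYPH_H = 16;

// Append one quad (4 vertices, each x, y, u, v) to the vertex array.
// `u` is a stand-in for a real texture coordinate.
function pushQuad(x, y, w, h, u) {
  const corners = [[x, y], [x + w, y], [x + w, y + h], [x, y + h]];
  for (const [cx, cy] of corners) {
    ctx.vertices.set([cx, cy, u, 0], ctx.count);
    ctx.count += 4;
  }
}

function button(label, rect) {
  const c = ctx.clip;

  // 1. Exit if the button is completely outside the clip rectangle.
  if (rect.x >= c.x + c.w || rect.x + rect.w <= c.x ||
      rect.y >= c.y + c.h || rect.y + rect.h <= c.y) {
    return false;
  }

  // 2. Insert the vertices for the button's rectangle.
  pushQuad(rect.x, rect.y, rect.w, rect.h, 0);

  // 3. Insert a quad per glyph, stopping at the first glyph that does not fit.
  let penX = rect.x;
  for (const ch of label) {
    if (penX + GLYPH_W > rect.x + rect.w) {
      break;
    }
    pushQuad(penX, rect.y, GLYPH_W, GLYPH_H, ch.codePointAt(0));
    penX += GLYPH_W;
  }

  // 4. Report a click if the mouse was pressed inside the rectangle.
  const m = ctx.mouse;
  return m.pressed &&
         m.x >= rect.x && m.x < rect.x + rect.w &&
         m.y >= rect.y && m.y < rect.y + rect.h;
}

ctx.mouse = { x: 20, y: 10, pressed: true };
const clicked = button("Click Me", { x: 10, y: 0, w: 40, h: 16 });
const offscreen = button("Hidden", { x: 500, y: 500, w: 40, h: 16 });

console.log(clicked);     // true
console.log(offscreen);   // false
console.log(ctx.count);   // 96
```

The visible button is 40 units wide, so only the five glyphs of "Click" fit: one background quad plus five glyph quads, at 16 floats each, gives 96. The off-screen button adds nothing.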

Note that those 4 steps also exist in the browser in HTML/DOM land except they are 4 steps of 100s.

So, in summary, the ImGUI style is potentially much faster and easier to use. It's easier to use in both the simple case and the complex case. The API is easier to use. It's easier to reason about. There is almost no state. There are no objects. There is no data marshalling. There are no events or callbacks. Because it's so fast, no giant frameworks like React's virtual DOM need to be created when the UI gets complex. Because of the speed, little to no effort is required to work around slowness as there is with the DOM. More research into ImGUI style UIs could lead to huge gains in productivity.
