Coordinate Systems in the World of Holograms
Post date: 08 March, 2016
HoloLens is a revolutionary device that I believe will change how we understand and interact with the world within the next five years. This article has the following goals:
- To introduce the basic challenges of displaying holograms
- To understand the positioning systems of holograms
- To understand different reference frames used in the positioning systems
- To encourage developers to create HoloApps
A coordinate system gives a unique identifier to each position in space. For example, a screen uses a Cartesian coordinate system in which each point is described by a pair of x- and y-coordinates. The four corners of a rectangle might be, say, (0,1), (6,1), (0,4), and (6,4). One unit along the x- or y-axis is one pixel on the screen, so the area of this rectangle is 6*3=18 square pixels. The position of the origin (0,0) depends on the context and the application. It is usually a point that is easy to locate, such as the top-left pixel or the center of the screen. Given the coordinate system of the display device, we can draw a rectangle precisely at position (10,10) on the screen. Moreover, we can also derive geometric relationships among the points of an object, such as distances, areas, and volumes.
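As a minimal sketch of the idea above, the following Python snippet treats a rectangle as four corner points measured in pixels and derives its width, height, area, and diagonal from the coordinates alone. The corner values and the helper name rect_metrics are illustrative assumptions and do not come from any HoloLens API.

```python
import math


def rect_metrics(corners):
    """Return (width, height, area, diagonal) of an axis-aligned rectangle.

    `corners` is a list of (x, y) pixel coordinates. The values used below
    are hypothetical and chosen only for illustration.
    """
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    # The diagonal is the distance between two opposite corners.
    return width, height, width * height, math.hypot(width, height)


corners = [(0, 1), (6, 1), (0, 4), (6, 4)]
width, height, area, diagonal = rect_metrics(corners)
print(width, height, area)  # 6 3 18
```

Everything here follows from the coordinates themselves, which is the point of having a coordinate system: once positions are numbers, geometric relationships become arithmetic.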
Similarly, we can use a Cartesian coordinate system to position a 3D object in the physical world using three perpendicular axes: X, Y, and Z. This is the spatial coordinate system in HoloLens. It is a right-handed coordinate system (see Figure 1). The positive X-axis points right, the positive Y-axis points up (aligned to gravity), and the positive Z-axis points backwards, toward the user. One unit of distance in the coordinate system corresponds to 1 meter in the real world. The spatial coordinate system is rigid and has real-world spatial meaning. Using this scale, you can design a virtual object which HoloLens will render in the real world.
Figure 1. Right-handed coordinate system. Image modified from http://viz.aset.psu.edu/gho/sem_notes/3d_fundamentals/gifs/left_right_hand.gif
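The axis convention described above can be checked with a cross product: in a right-handed system, X crossed with Y gives Z. The snippet below is only a sketch in plain Python. The names cross, X, Y, Z, and hologram are assumptions made for illustration, and the hologram position is a made-up example.

```python
import math


def cross(a, b):
    """Cross product of two 3D vectors given as (x, y, z) tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


X = (1.0, 0.0, 0.0)  # right
Y = (0.0, 1.0, 0.0)  # up, aligned to gravity
Z = (0.0, 0.0, 1.0)  # backwards, toward the user

# In a right-handed system, X cross Y equals Z.
is_right_handed = cross(X, Y) == Z
print(is_right_handed)  # True

# Because +Z points backwards, a point in front of the origin has negative Z.
# Hypothetical hologram: 2 meters ahead and 0.5 meters up.
hologram = (0.0, 0.5, -2.0)
distance_m = math.dist((0.0, 0.0, 0.0), hologram)  # in meters
```

Note that "in front" means negative Z under this convention, which is easy to get wrong when moving from a left-handed system.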
A coordinate system needs a frame of reference, that is, a reference point that serves as the origin. The system provides this reference point, and all other points are expressed relative to it. But where is the reference point? First of all, it should be easy to locate and unambiguous. In addition, since the user moves with the HoloLens, the reference point has to be relatively stable. Otherwise, the holograms will drift along with the movement of the reference point. HoloLens identifies this special point in the real world using the light information collected by its sensors. It then creates a stationary frame of reference, which aims to keep the positions of objects near the user as stable as possible relative to the world. This coordinate system is maintained throughout the app's lifetime.
The HoloLens is an intelligent device that continuously senses and understands the world. Its knowledge is incomplete, and part of it is approximate, such as the actual distances in the real world. For example, the device may currently believe two locations in the world to be 5 meters apart and later, with new sensing information, realize that the actual distance is only 4 meters. If holograms had initially been placed at those locations, 5 meters apart in a single rigid coordinate system (i.e. the spatial coordinate system), one of them would then always appear 1 meter off from its real-world location. As a result, if users move around a large area, the holograms may drift from their original positions. This is inconsistent with the user's spatial memory and is annoying. The device would have to track the whole space and adjust the positions smoothly as the user moves.
To balance the computational workload against positioning accuracy, HoloLens introduces the concept of a spatial anchor, which represents an important point in the real world that the system should keep track of over time. Each spatial anchor has its own coordinate system, which is adjusted over time relative to other spatial anchors and frames of reference. The spatial anchor's coordinate system provides a more stable position for the hologram at a given time. These small adjustments over time are required to keep the holograms correctly positioned relative to the world. By placing a hologram in the coordinate system of a nearby spatial anchor, you give the hologram optimal stability. (It is unclear whether a hologram can be positioned using multiple spatial anchors or whether one anchor is sufficient.)
This continuous adjustment of spatial anchors relative to one another is the key difference between the coordinate systems of spatial anchors and stationary frames of reference. Holograms placed near the user in the stationary frame of reference appear stable, while holograms attached to a spatial anchor may drift relative to other spatial anchors.
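The 5-meter versus 4-meter example can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the HoloLens API: the Anchor and Hologram classes and the add helper are hypothetical, positions are plain (x, y, z) tuples in meters, and rotation is ignored. A hologram stored relative to an anchor follows the anchor when the device revises its estimate, while a position copied into a single rigid frame stays where it was first computed.

```python
from dataclasses import dataclass


def add(a, b):
    """Component-wise sum of two (x, y, z) tuples."""
    return tuple(p + q for p, q in zip(a, b))


@dataclass
class Anchor:
    """Hypothetical stand-in for a spatial anchor: an origin whose
    estimated position the device may revise over time."""
    world_position: tuple


@dataclass
class Hologram:
    anchor: Anchor
    offset: tuple  # position in the anchor's own coordinate system

    def world_position(self):
        return add(self.anchor.world_position, self.offset)


# Initial estimate: the anchor is 5 meters in front of the origin (Z = -5).
far_anchor = Anchor(world_position=(0.0, 0.0, -5.0))
anchored = Hologram(far_anchor, offset=(0.0, 0.5, 0.0))

# The same hologram, frozen into one rigid coordinate system.
rigid_position = anchored.world_position()

# New sensing information: the location is actually 4 meters away.
far_anchor.world_position = (0.0, 0.0, -4.0)

print(anchored.world_position())  # (0.0, 0.5, -4.0), follows the anchor
print(rigid_position)             # (0.0, 0.5, -5.0), now 1 meter off
```

Storing the small offset rather than the absolute position is the design choice that matters here: only the anchor needs correcting, and every hologram attached to it is corrected with it.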
An attached frame of reference is a reference frame relative to the user, i.e. the head of the user can be regarded as the origin of the reference frame. Holograms placed in an attached frame of reference move with the user. In other words, you bring these holograms with you. Content rendered in these holograms is called 'body-locked' content. An attached frame of reference has a fixed orientation, defined when it is first created. It does not rotate even when the head or body of the user turns. It serves as a (spatial) fallback when the device cannot figure out where it is in the world.
Lastly, head-locked content stays at a fixed spot in the display, like a head-up display. It lacks depth information. Head-locked content is often uncomfortable for users and does not feel like a natural part of their world.
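The difference between body-locked and head-locked content can be sketched as follows. This is an illustrative model only: the function names body_locked, head_locked, and rotate_yaw are assumptions, the head pose is reduced to a position plus a yaw angle about the Y axis, and the attached frame is assumed to have been created with the world's orientation. Body-locked content follows the head's position but ignores its rotation, whereas head-locked content also turns with the head and so stays at the same spot in the display.

```python
import math


def rotate_yaw(v, yaw):
    """Rotate vector v about the Y (up) axis by `yaw` radians,
    using the right-handed convention (positive yaw turns left)."""
    x, y, z = v
    c, s = math.cos(yaw), math.sin(yaw)
    return (c * x + s * z, y, -s * x + c * z)


def body_locked(head_pos, offset):
    """Attached frame: follows the head position, keeps a fixed orientation."""
    return tuple(h + o for h, o in zip(head_pos, offset))


def head_locked(head_pos, head_yaw, offset):
    """Follows both the position and the rotation of the head."""
    rotated = rotate_yaw(offset, head_yaw)
    return tuple(h + r for h, r in zip(head_pos, rotated))


offset = (0.0, 0.0, -2.0)    # 2 meters in front when the frame was created
head_pos = (1.0, 0.0, 0.0)   # the user has stepped 1 meter to the right
head_yaw = math.pi / 2       # and turned 90 degrees to the left

body = body_locked(head_pos, offset)
head = head_locked(head_pos, head_yaw, offset)
print(body)  # (1.0, 0.0, -2.0): moved with the user, did not turn
print(head)  # approximately (-1, 0, 0): still straight ahead of the user
```

After the turn, the body-locked hologram is off to the user's right, so the user can look away from it and back again, while the head-locked one remains directly in view.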
In summary, the coordinate system is the basis for positioning holograms. HoloLens uses the spatial coordinate system to represent positions in the real world. Since it is a mixed-reality device, it has to learn and understand the real world gradually in order to keep holograms stable. A virtual coordinate system is constructed with the help of spatial anchors. Holograms can also be displayed independently of the real environment through an attached frame of reference. As a last resort, head-locked content shows the information on the display.