Abstract
Direct volume rendering (DVR) enables the visual exploration of regions of interest (ROIs) in a volume in an interactive manner. Mixed reality head-mounted devices (MR-HMDs) have the potential to expand the utility of DVR. MR-HMDs enable not only high-quality, stereoscopic DVR but also the in-situ visualization of complementary data to assist in the real-world tasks of users. The state-of-the-art MR-HMD works, however, mainly focus on investigating the feasibility and effectiveness of DVR in target MR applications. The interactive ROI-based exploration of DVR has been severely limited in prior research, and user interaction is still confined to the primitive image manipulation of pre-configured ROIs (e.g., view rotation). In this work, we propose an interactive and semi-automated exploration approach for ROI-based DVR in MR-HMDs by introducing a new MR-HMD-oriented optimization of rendering parameters (i.e., transfer functions (TFs)) and utilizing the built-in biometric interfaces. Using eye-gazing and hand gestures, users can select ROIs within an initial DVR visually and interactively. Our approach (named MR-TF hereafter) then automatically optimizes the corresponding TF to ensure the visibility of the ROIs. Additionally, our MR-TF approach addresses a critical rendering issue in MR-HMDs: the color perception of ROIs should be consistent regardless of variations in the intermixed physical backgrounds. Our MR-TF approach quantifies this color perception and integrates it into the TF optimization for enhanced ROIs. We achieve an on-device implementation of our MR-TF approach on a commercial MR-HMD with low computing capability, Microsoft HoloLens 2. Our evaluations with different types of volumes demonstrate the capabilities and practicality of our MR-TF approach.
1 Introduction
Direct volume rendering (DVR) is one of the fundamental algorithms for the three-dimensional (3D) visualization of volumetric data and has been shown to be effective across a wide range of data modalities and application domains (Kaufman et al. 1994; Preim and Bartz 2007; Fuchs and Hauser 2009; Zhang et al. 2011; Gao et al. 2022; Heyd and Birmanns 2009). A core feature of DVR lies in its ability to project all regions (voxels) within a volume onto a display, thereby facilitating the simultaneous visualization of multiple regions. More importantly, DVR offers region of interest (ROI)-based exploration. ROIs can vary across volumes and even within the same volume according to usage scenarios. DVR enables a user to manipulate (emphasize) ROIs in an interactive manner.
The recent advancements in mixed reality (MR) technologies have the potential to expand the utility of DVR. MR head-mounted devices (MR-HMDs) offer glass-based, high-fidelity displays with stereoscopic vision, allowing for the full 3D visual perception of ROIs. More importantly, MR-HMDs provide the in-situ visualization of complementary DVRs in the physical workspace where a user conducts their tasks, by seamlessly overlaying them onto the user's field of view (Wong et al. 2023; Lee et al. 2021). There are now affordable commercial MR-HMDs, such as Microsoft HoloLens 2 and Apple Vision Pro, capable of meeting the on-device computing requirements of DVR generation (Jung et al. 2022; Hrycak et al. 2024). Building on these capabilities, MR-HMDs not only improve spatial awareness of complex volumetric data but also enhance task efficiency by enabling intuitive, real-time interaction with DVRs directly within the user's physical workspace. Investigators have begun to explore the feasibility of utilizing DVR across diverse usage scenarios within the domain of MR (Zhang et al. 2022; Macedo et al. 2014; Wieczorek et al. 2010; Leuze et al. 2018). Their primary focus was on implementing and deploying DVR itself in particular MR applications. Here, the ROI-based feature of DVR was heavily restricted: ROIs had to be pre-configured per in-situ visualization, and switching the rendered ROIs or updating their visual appearance online on an MR-HMD was rarely achievable. User interactions were confined to primitive manipulations including rotation, panning, and scaling of DVRs (Allison et al. 2020; Cheng et al. 2023; Pooryousef et al. 2023).
Transfer functions (TFs) play an indispensable role in interactive ROI-based DVR. Given a TF widget, a user is required to associate the ROIs of a volume with rendering parameters for visual emphasis. In widely established TF design approaches, users have to specify the intensity ranges that represent ROIs, followed by assigning opacity and color values to each intensity range (Ljung et al. 2016; Kindlmann 2002; Wu and Qu 2007). As such, TF design inherently involves non-intuitive and iterative adjustment in TF parameter spaces. Such a labor-intensive nature poses great challenges when these approaches are directly applied to MR-HMDs. MR-HMDs commonly rely on biometric interfaces, such as eye-gazing, hand gestures, and voice commands. In-air inputs through hand gestures, for instance, may lack the requisite sensitivity for fine-grained TF adjustments. This implies that obtaining desired DVR results in MR-HMDs via traditional TF design approaches can be time-consuming or sometimes unachievable. Furthermore, MR-HMD applications potentially demand minimal interaction; for instance, during surgical interventions, clinicians perform a set of procedures on the patient's body using their biometric interfaces (i.e., hands) while simultaneously requiring interaction with DVRs of pre-operative patient data to access surgical guidance (Wong et al. 2023; Wang et al. 2021). Such constraints raise the immediate need to devise new TF design approaches that allow for ROI-based DVR with minimal MR interactions and enhance the applicability of DVR in MR-HMDs.
In this work, we propose a new MR-HMD TF design (MR-TF) approach that aims to enable intuitive and semi-automated ROI-based DVR interactions in challenging MR-HMDs. The main strategy of our work is to introduce a visibility-based TF parameter optimization algorithm (Correa and Ma 2009, 2010; Jung et al. 2013, 2016) to MR-HMDs. Visibility is a metric that quantifies how discernible each voxel (or an ROI as a group) in a volume is to the user by computing its opacity contribution to the resulting DVR. This metric has been widely used as a loss function in TF parameter optimizations; an initial TF is iteratively fine-tuned to prioritize the visibility of user-selected ROIs by identifying occluding regions and adjusting their TF parameters (Correa and Ma 2009). The conventional visibility metric, however, cannot sufficiently represent the characteristics of MR-HMDs. In MR-HMDs, the DVR is projected onto semi-transparent glass displays, and the user observes it superimposed on physical backgrounds. The conventional visibility metric considers only the volume itself, without accounting for the influence of the underlying physical backgrounds. The color discrepancies arising from diverse physical backgrounds can affect the visual perception of optimized ROIs (Hincapié-Ramos et al. 2014; Zhang et al. 2021). One technical objective in developing our MR-TF approach was therefore to improve the conventional visibility computation. Our MR-TF approach ensures the consistent visual perception of ROIs in MR-HMDs by taking into account the color contrast with physical backgrounds during the visibility computation of ROIs, i.e., TF adjustments are more pronounced when ROIs are superimposed on physical backgrounds that are similar in color. Another technical objective was to allow the user selection of ROIs to occur within image spaces (i.e., DVRs), not TF parameter spaces, to make it simpler and more intuitive.
Our MR-TF approach leverages biometric interfaces commonly built into commercial MR-HMDs. By combining eye-gazing and hand gestures, the user selection can be performed in a visual, stereoscopic, and thus 3D-recognizable manner, particularly for initial DVRs containing overlapping regions. Additionally, our MR-TF approach does not rely on remote computing and achieves on-device implementation on a commercial MR-HMD, Microsoft HoloLens 2. We demonstrate the capability and applicability of our MR-TF approach by experimenting with various volume datasets and MR-HMD usage cases. The contributions of our work are summarized as follows:
-
To the best of our knowledge, our work is the first to focus on devising an intuitive and semi-automated TF design approach dedicated to interactive ROI-based DVR in MR-HMDs by introducing the visibility-based TF parameter optimization algorithm (Correa and Ma 2010; Jung et al. 2013; Ma and Entezari 2017) and utilizing the built-in biometric interfaces,
-
Our work ensures the consistent visual perception of ROIs against superimposed physical backgrounds (colors) in MR-HMDs by integrating the concept of color contrast into the conventional visibility computation,
-
Our work allows for the intuitive and precise user selection of ROIs in MR-HMDs by leveraging a set of built-in biometric interfaces, and
-
Our work accomplishes the on-device computing requirements for MR-HMD applications by demonstrating our MR-TF approach on the commercial MR-HMD, Microsoft HoloLens 2.
2 Related work
2.1 TF design approaches for interactive ROI-based DVR
The established intensity-based one-dimensional (1D) TF has been advanced toward multi-dimensional TFs to improve ROI-centric DVR interaction. Kindlmann and Durkin (1998) investigated two-dimensional (2D) TFs using intensity with its first- or second-order derivative. These additionally derived features were projected as a secondary dimension on TF widgets, with which a user can better identify gradient information along the intensity (representing ROIs) and interactively emphasize their boundaries. Other features, such as size, occlusion, and statistical values (Max 1995; Cai et al. 1995; Caban and Rheingans 2008; Correa and Ma 2008), were considered as a secondary dimension to improve the interactive identification of ROIs. Some investigators used more than two dimensions, e.g., 20 local texture features by Caban and Rheingans (2008), to allow for greater differentiation among ROIs. The increase in TF dimensions, however, introduced additional complexity in the manual TF optimization by users. MR-HMDs usually offer intuitive but coarse biometric interfaces and may not be an ideal environment for applying multi-dimensional TFs.
Image-based TF design approaches, as an alternative, allow for intuitively identifying ROIs through interactive gestures on DVRs such as strokes and painting (Ropinski et al. 2008; Guo et al. 2011; Jung et al. 2018). An approach introduced by Ropinski et al. (2008) enabled the selection of ROIs by outlining their silhouettes with strokes, which were then generated as distinct components in TF widgets. Users were required to refine and combine the components to adjust the visibility of ROIs. Guo et al. (2011) proposed an approach to manipulating the appearance of ROIs with high-level painting tools, e.g., peeling, contrast adjustments, and erasers. Jung et al. (2018) extended image-based TF designs from volumes to four-dimensional (4D) data (time + volume). These approaches, however, could not always match gesture inputs to the intended ROIs, particularly on DVRs where ROIs were semi-transparent and/or multi-layered. They sometimes involved extensive user interaction and may not be ideal for DVR in MR-HMDs.
Some investigators focused on parameter optimization algorithms with the aim of reducing user interactions in TF designs (Correa and Ma 2009; Ropinski et al. 2008; Guo et al. 2011; An et al. 2023). Devising an effective cost function plays a pivotal role in the performance of TF parameter optimization approaches. A pioneering and dominant cost function was visibility, proposed by Correa and Ma (2009). The novelty and utility of visibility have been further investigated by several investigators (Ruiz et al. 2011; Bordoloi and Shen 2005; Jung et al. 2016, 2013; Ma and Entezari 2017). Ruiz et al. (2011) improved the performance of the TF parameter optimization process by making it converge more accurately and rapidly. They leveraged the Kullback-Leibler distance measure, instead of the Jensen-Shannon measure used in Bordoloi and Shen (2005), to refine the matching between the user-defined target visibility distribution of ROIs and the resultant visibility counterparts from the TF parameter optimization process. Jung et al. (2016) reduced the computational burden of visibility by leveraging the parallel architecture of GPUs. Another work by Jung et al. (2013) extended visibility-based TF parameter optimization to dual-modality volumes, where the visibility of ROIs in one modality was enhanced by keeping its TF fixed but optimizing the TF of the other modality. Recently, Ma and Entezari (2017) enhanced sensitivity in the visibility computation by assigning weights to voxels according to their locations, i.e., higher weights were given to voxels on the boundaries of ROIs compared to the interior. All these approaches were developed for conventional DVR environments with opaque displays and so could not sufficiently address the semi-transparent nature of MR-HMDs, where the visualizations of ROIs are intermixed with physical backgrounds.
2.2 DVR applications in MR-HMDs
Jung et al. (2022) explored the feasibility of adopting DVRs in the latest commercial MR-HMD, Microsoft HoloLens 2, to process and visualize volume datasets with different resolutions. The results suggested that the current generation of MR-HMDs was viable for the computationally demanding DVR. Along with these findings, a growing number of investigators are exploring the utility of DVR in diverse MR usage scenarios. Macedo et al. (2014) applied DVR to on-patient MR visualization using medical volumes. Here, the patient's head was augmented with a computed tomography (CT) volume for the precise localization of ROIs in the physical anatomy. Zhang et al. (2022) developed an MR-based digital imaging and communications in medicine (DICOM) DVR viewer. The MR work by Pooryousef et al. (2023) proposed a forensic autopsy workflow using DVR. Their user study found that the proposed system could be a valid option for reproducible digital autopsies. Although valuable, all these works focused on implementing DVR for specific MR usage scenarios. Technical challenges in adapting DVR to MR-HMDs were not considered in those works. There were only a few works that aimed to improve DVR to better adapt it to MR-HMD environments. The work by Cheng et al. (2023) introduced physical environment-synced illumination to enhance the visual quality of DVRs in MR-HMDs. We observed a lack of work on enhancing effective DVR interactions (i.e., TF design) in MR-HMDs, which motivates us to propose our intuitive and semi-automated MR-TF approach.
3 Methods
3.1 Overview
We illustrate the overview of our intuitive and semi-automated MR-TF approach for interactive ROI-based DVR in MR-HMDs in Fig. 1 using a tooth CT volume. An initial intensity-based 1D TF consisted of three tent peaks to visualize the dentin, vessels, and crown, and the vessels, as an ROI, were faintly visible in the initial DVR (see Fig. 1a). A user intuitively determined a point of interest using the three built-in biometric interfaces of MR-HMDs (see Fig. 1b). The combined use of these interfaces allowed for the stereoscopic and 3D-recognizable selection of the ROI in the multi-layered DVR. Our MR-TF approach leveraged an ROI segmentation algorithm, such as region growing (Monga 1987), to obtain the whole ROI, where the user-determined point was considered as a seed. The subsequent iterative TF parameter optimization process began with the initial TF and the ROI semantic label. Each iteration generated an intermediate TF, computed the visibility of the ROI, and measured whether it matched the target value the user assigned via voice commands (see Fig. 1c). Here, we weighted the visibility of the ROI according to its color contrast with the physical backgrounds to be intermixed in MR-HMDs. This can enhance the consistent color perception of the ROI despite variations in the physical backgrounds. The process finished when it converged to an optimal TF solution that prioritized the ROI while attenuating the occluding dentin and crown regions (see Fig. 1d).
3.2 Simple and intuitive user selection of ROIs using built-in biometric interfaces of MR-HMDs and ROI segmentation algorithm
We obtained a 3D intersection point between a plane generated from eye-gazing and the pointing vector of the finger, as shown in Fig. 2a. The 3D point was highlighted with a yellow sphere for validation using voice commands. Using a control panel, the user could manually adjust the 3D position in 5-mm units along each axis in cases where precise placement is required. The gazing plane of the eyes was generated by casting, along the y-axis, a line that connects the convergence point of both eyes and the center point of the MR-HMD.
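As a minimal geometric sketch of this selection step (in Python with NumPy; all names and the plane/ray parameterization are illustrative, not taken from our implementation), the intersection between the gazing plane and the finger's pointing ray can be computed as:

```python
import numpy as np

def gaze_plane_finger_ray_intersection(plane_point, plane_normal,
                                       finger_origin, finger_dir):
    """Intersect the eye-gazing plane with the hand's pointing ray.

    plane_point / plane_normal define the gazing plane; finger_origin /
    finger_dir define the pointing vector of the finger. Returns the 3D
    intersection point, or None if the ray is parallel to the plane or
    the plane lies behind the hand.
    """
    finger_dir = finger_dir / np.linalg.norm(finger_dir)
    denom = np.dot(plane_normal, finger_dir)
    if abs(denom) < 1e-6:   # ray runs parallel to the gazing plane
        return None
    t = np.dot(plane_normal, plane_point - finger_origin) / denom
    if t < 0:               # intersection would be behind the hand
        return None
    return finger_origin + t * finger_dir
```

In practice, the returned point would then be highlighted (e.g., with the yellow sphere) and fine-adjusted via the control panel.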
We leveraged the region growing algorithm (Monga 1987), where the intersection point was used as the seed input. Using the intensity value of the seed voxel, it iteratively propagated to adjacent voxels to obtain the whole ROI. The iteration terminated when neighboring voxels no longer fell within a threshold range. The threshold range was level-selected via voice command to allow a user to decide the desired ROI. Level 1 meant a threshold range of ± 30 around the intensity value of the seed voxel, which widened by 30 per level up to the maximum level of 10. We chose the region growing algorithm due to its inherent advantages of simplicity in implementation and robustness to noise (Justice et al. 1997). Another key advantage was its minimal computational demands, facilitating the on-device implementation of our MR-TF approach on low-spec commercial MR-HMDs. However, note that our MR-TF approach was not limited to this particular segmentation algorithm and could adopt advanced segmentation algorithms by relying on remote computing.
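A minimal sketch of this leveled region growing (Python with NumPy; the 6-connected neighborhood is our illustrative assumption) is:

```python
import numpy as np
from collections import deque

def region_growing(volume, seed, level):
    """Flood-fill segmentation from a seed voxel (illustrative sketch).

    The intensity window is +/- (30 * level) around the seed intensity,
    mirroring the voice-command levels (level 1 = +/-30, up to level 10).
    Returns a boolean mask serving as the ROI semantic label.
    """
    threshold = 30 * level
    seed_value = int(volume[seed])
    mask = np.zeros(volume.shape, dtype=bool)
    queue = deque([seed])
    mask[seed] = True
    while queue:
        x, y, z = queue.popleft()
        # visit the 6-connected neighbors of the current voxel
        for dx, dy, dz in ((1,0,0), (-1,0,0), (0,1,0), (0,-1,0), (0,0,1), (0,0,-1)):
            n = (x + dx, y + dy, z + dz)
            if all(0 <= n[i] < volume.shape[i] for i in range(3)) and not mask[n]:
                if abs(int(volume[n]) - seed_value) <= threshold:
                    mask[n] = True
                    queue.append(n)
    return mask
```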
3.3 Visibility-based TF parameter optimization
We optimized an initial TF iteratively toward minimizing a loss function, \(F\), that consisted of the visibility tolerance for the user-selected ROI, \({F}_{T},\) and the tolerance of TF movements, \({F}_{M}:\)

$$F\left(\theta \right)=\omega {F}_{T}\left(\theta \right)+\left(1-\omega \right){F}_{M}\left(\theta \right)$$
(1)
where \(\theta \) was a set of opacity parameters of the initial TF and \(\omega \) was the weight for the visibility tolerance. The visibility tolerance was used to control how much the ROI was observable under the optimized TF (i.e., in the resultant DVR) and was calculated as the square error between the user-assigned target visibility, \({V}_{T}\), and the computed visibility, \({V}_{C}\):

$${F}_{T}\left(\theta \right)={\left({V}_{T}-{V}_{C}\left(\theta \right)\right)}^{2}$$
(2)

The target visibility was an intuitive hyperparameter that could be expressed as a multiple of the initial visibility of the ROI through voice commands. The details of the visibility computation are described in the following sub-section.
The tolerance of TF movements was introduced to control the minimum \({\theta }_{n}^{min}\) and maximum \({\theta }_{n}^{max}\) of the opacity parameters, as shown below:

$${F}_{M}\left(\theta \right)=\frac{1}{N}\sum_{n=1}^{N}\left({\left[{\theta }_{n}^{min}-{\theta }_{n}\right]}_{+}+{\left[{\theta }_{n}-{\theta }_{n}^{max}\right]}_{+}\right)$$
(3)
where \(N\) is the total number of TF parameters, and \({[x]}_{+}\) is a clamping operator, i.e., \(x\) if \(x>0\) or 0 otherwise. This term was beneficial for situations where other regions should not be unintentionally made fully transparent, i.e., they remained somewhat visible to offer context around the ROIs.
We employed a widely established intensity-based 1D TF which mapped the intensity range representing ROIs (voxels) to opacity parameters to be optimized. An initial TF was formulated as a set of components having a tent shape (Jung et al. 2018). We optimized the peak of all tent-shaped components, and the bottom side remained unchanged (see Fig. 1). The tent shape was intentionally chosen because it could depict the iso-surface of the corresponding ROIs and allow the final TF to semi-transparently exhibit multiple ROIs, which was desirable for complex ROI-based DVR. Moreover, it could be easily parameterized by any TF parameter optimization algorithms. Nonetheless, our MR-TF approach was not limited to this particular shape and could be re-parameterized into other shapes depending on specific visualization and data needs (Itti et al. 1998).
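A minimal sketch of evaluating such a tent-shaped 1D TF (Python; combining overlapping tents by taking the maximum opacity is our illustrative assumption):

```python
def tent_tf_opacity(intensity, tents):
    """Evaluate a 1D TF built from tent-shaped components (illustrative).

    Each tent is (left, peak_pos, right, peak_opacity); the bottom edge
    (left/right) stays fixed while an optimizer adjusts peak_opacity.
    Overlapping tents are combined here by taking the maximum opacity.
    """
    opacity = 0.0
    for left, peak, right, peak_opacity in tents:
        if left < intensity < right:
            if intensity <= peak:   # rising edge of the tent
                o = peak_opacity * (intensity - left) / (peak - left)
            else:                   # falling edge of the tent
                o = peak_opacity * (right - intensity) / (right - peak)
            opacity = max(opacity, o)
    return opacity
```

For instance, a tent spanning intensities 0-100 with its peak at 50 yields the peak opacity at intensity 50 and zero outside its span.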
We leveraged the Nelder-Mead simplex method (Nelder and Mead 1965) to minimize our loss function. It was effective for solving our nonlinear TF parameter optimization problem, where derivatives were difficult to compute explicitly. It furthermore tended to be robust to initialization variations during optimization convergence (Lagarias et al. 1998) and could mitigate concerns of falling into local minima (Manousopoulos and Michalopoulos 2009) that TF parameter optimization processes frequently encounter. It effectively explored optimization parameter spaces through the utilization of a geometric structure known as a simplex. Initial TFs could be parameterized by associating the opacity of each tent peak with the vertices of the simplex. It iteratively converged the simplex through four operations: reflection, expansion, contraction, and shrink.
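A toy sketch of this optimization loop (Python with SciPy; the stand-in visibility model, constants, and bounds are illustrative assumptions, not our DVR-based computation):

```python
import numpy as np
from scipy.optimize import minimize

# Toy stand-in for the visibility computation: the "visibility" of the ROI
# (third tent peak) grows with its own opacity and shrinks with the
# occluders' peaks. The real V_C comes from DVR compositing (Sect. 3.4).
def computed_visibility(theta):
    return theta[2] * (1.0 - 0.5 * (theta[0] + theta[1]))

V_T = 0.6            # user-assigned target visibility (voice command)
omega = 0.95         # priority weight on the visibility tolerance
t_min, t_max = 0.0, 1.0

def loss(theta):
    f_t = (V_T - computed_visibility(theta)) ** 2        # visibility tolerance F_T
    f_m = np.sum(np.maximum(t_min - theta, 0.0)
                 + np.maximum(theta - t_max, 0.0))       # TF-movement tolerance F_M
    return omega * f_t + (1.0 - omega) * f_m

theta0 = np.array([0.5, 0.5, 0.3])   # initial tent-peak opacities
res = minimize(loss, theta0, method="Nelder-Mead")
```

After convergence, `res.x` holds the optimized tent-peak opacities; the loss drops essentially to zero once the toy visibility matches the target.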
3.4 Visibility computation of ROIs weighted according to color contrast to physical backgrounds of MR-HMDs
The visibility computation of a voxel at a coordinate position \(p\), \(T(p)\), was defined as its opacity contribution attenuated by the opacities accumulated over all the voxels from the eye position \(E\) to \(p\) in Fig. 3a (Correa and Ma 2009, 2010; Jung et al. 2013):

$$T\left(p\right)=O\left(p\right)\left(1-A\left(p\right)\right)$$
(4)
where \(A\left(p\right)\) is the opacity composited in front of the voxel, and \(O(p)\) is its opacity defined by a TF. The visibility of an ROI, \(V\), was expressed as the weighted sum over the voxels belonging to the ROI:

$$V=\sum_{p\in X}W\left(p\right)T\left(p\right)$$
(5)
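A minimal sketch of this per-ray accumulation (Python; sampling and compositing details are simplified to a single ray of pre-sampled opacities):

```python
def ray_visibility(opacities):
    """Front-to-back visibility of each sample along one viewing ray.

    `opacities` lists O(p) for samples ordered from the eye E to the far
    end of the volume. Each sample's visibility is its own opacity
    attenuated by the opacity already composited in front of it.
    """
    visibilities = []
    accumulated = 0.0   # A(p): opacity composited so far
    for o in opacities:
        visibilities.append(o * (1.0 - accumulated))   # T(p) = O(p)(1 - A(p))
        accumulated += o * (1.0 - accumulated)         # front-to-back update
    return visibilities
```

For example, two samples of opacity 0.5 along a ray receive visibilities 0.5 and 0.25: the rear sample is half-occluded by the front one.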
where \(X\) is the semantic label of the ROI, and \(W(p)\) is a weight function based on the color contrast of the voxel against the physical backgrounds of MR-HMDs. For the color contrast-based weight function, we computed the HSV (hue, saturation, and value) difference between each voxel within an ROI and the average value from the corresponding region of the physical background, which underwent a natural logarithmic function and was then normalized to a range between 0.5 and 1.5:

$$W\left(p\right)={\mathrm{norm}}_{\left[0.5,1.5\right]}\left(\mathrm{ln}\left(1+\sum_{i}\left|H\left({p}_{i}\right)-{h}_{i}\right|\right)\right)$$
(6)
where \(H\left({p}_{i}\right)\) is the value of each HSV channel for the voxel and \({h}_{i}\) is the average value of the corresponding channel of the physical background. As shown in Fig. 3b, a lower weight value meant that the voxel was more similar to the physical background in color and contributed less to the visibility computation (i.e., the TF parameter optimization). To satisfy the target visibility, a more aggressive change in TF parameters was necessary to make the voxel more observable against a similarly colored physical background. We chose the HSV color space to quantify color difference because it has been demonstrated to align more closely with the human visual system by accounting for brightness variations that are not available in the RGB (red, green, blue) counterpart (Ajmal et al. 2018). Note that our MR-TF approach was not restricted to this particular color space and could adopt others if necessary. CIE-Lab could be an alternative that offers a more perceptually uniform color distribution, but its adoption may introduce additional computational overhead to MR-HMDs due to its increased complexity.
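A minimal sketch of such a contrast-based weight (Python standard library; the circular hue distance and the exact normalization constants are our illustrative assumptions, as the normalization is described only at a high level above):

```python
import colorsys
import math

def color_contrast_weight(voxel_rgb, background_rgb):
    """Weight a voxel by its HSV difference to the average background color.

    RGB inputs are in [0, 1]; the log-compressed HSV difference is mapped
    to [0.5, 1.5]: similar colors get a low weight (so stronger TF changes
    are needed to reach the target visibility), dissimilar colors a high one.
    """
    v_h, v_s, v_v = colorsys.rgb_to_hsv(*voxel_rgb)
    b_h, b_s, b_v = colorsys.rgb_to_hsv(*background_rgb)
    hue_diff = min(abs(v_h - b_h), 1.0 - abs(v_h - b_h))   # hue is circular
    diff = hue_diff + abs(v_s - b_s) + abs(v_v - b_v)      # summed HSV difference
    max_diff = 0.5 + 1.0 + 1.0                             # largest possible sum
    return 0.5 + math.log1p(diff) / math.log1p(max_diff)   # map into [0.5, 1.5]
```

A red voxel over a red background thus receives the minimum weight of 0.5, while the same voxel over a cyan background receives a noticeably higher weight.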
4 Results
4.1 Experiments
We experimented with our intuitive and semi-automated MR-TF approach on the established commercial MR-HMD, Microsoft HoloLens 2, with a Qualcomm Snapdragon 850 processor, 4 GB DRAM, a 3904 × 2196-resolution semi-transparent display, and the biometric interfaces of eye-gazing, hand gestures, and voice commands. Our MR-TF approach was implemented on the Unity development platform using the mixed reality toolkit (MRTK) library (Microsoft 2022) with C# and DirectX shading languages. We used the volume rendering engine Voreen to display the TF optimization results (Meyer-Spradow et al. 2009). The 2D visualization results, with each dimension ranging from 25 to 30 cm, were positioned 1 m away from the MR-HMD to ensure proper alignment within the user's field of view. We also note that approximately 1 m is the distance at which a human can effectively perceive 3D depth and thus perform MR interactions (Joo and Lee 2021; Mirbagheri and Chau 2024). The target visibility (\({V}_{T}\)) and threshold level of ROIs were defined according to visualization and data needs. The weight (\(\omega \)) was consistently assigned 0.95 to place a much higher optimization priority on the visibility tolerance (\({F}_{T}\)). The minimum and maximum values of TF movements (\({F}_{M}\)) were set to 99% to allow nearly full tolerance. We obtained the images of physical backgrounds by using the built-in camera of the HoloLens 2 with adaptive white balance and exposure functions.
We evaluated the various capabilities of our MR-TF approach to generate ROI-based DVRs in MR-HMDs using seven volume datasets. They were obtained from established open-access archives, including OsiriX (Pixmeo 2024), Voreen (Meyer-Spradow et al. 2009), and Open Scientific Visualization Dataset (Klacansky 2017). They have been used for the verification of diverse TF and DVR approaches (Correa and Ma 2010; Jung et al. 2013; An et al. 2023), thus supporting our evaluation in terms of experimental replicability and reliability.
We also assessed the usability of our MR-TF approach by conducting a user study. Our MR-TF approach was compared with a conventional counterpart that required a user to manually adjust the 1D TF using an MR-HMD widget. We utilized the system usability scale (SUS) (Brooke 1996), which has been demonstrated to be reliable and valid across numerous user studies (Lewis 2018). The SUS questionnaire comprises 10 items, each rated on a five-point Likert scale. A total of seven participants, who had experience with DVR and TFs, were recruited. They exhibited varying levels of familiarity with an MR-HMD, ranging from novices with no prior experience to those who were fairly accustomed to using the technology. They were initially shown an instructional video, after which they were asked to generate two representative cases (see Figs. 4 and 8) using each of the two TF approaches. Upon completing each task, they filled out the SUS questionnaire and provided informal feedback.
Fig. 4 Optimization results from our MR-TF approach according to three variations of target visibility (\({V}_{T}\)) using an engine CT volume. The first row shows DVRs; the second shows close-up views of the ROI and its visibility map, where brighter regions exhibit higher visibility; and the third shows TFs
4.2 Our intuitive and semi-automated MR-TF approach for interactive ROI-based DVRs in MR-HMD
In Fig. 4, we show how our MR-TF approach optimizes an initial TF according to three different target visibility (\({V}_{T}\)) values. An engine CT volume was visualized with the initial TF emphasizing multiple regions of the exterior chassis, interior structures, and cylinders. The front cylinder, as an ROI, was initially occluded by the other regions (see Fig. 4a). A user intuitively determined the ROI through the eye-gazing and hand gesture biometric interfaces of the MR-HMD (see the yellow circles), and it was then segmented with the region growing algorithm. The values of \({V}_{T}\) were input via voice commands. In Fig. 4b with \({V}_{T}\) = 1.6, the visualization of the ROI became somewhat more apparent than its initial counterpart, but the occluding regions still remained obvious. These visual optimizations were made by our MR-TF approach, which automatically decreased the opacity parameters representing the occluding regions according to \({V}_{T}\) (see the first and second tent peaks in the TF). In Fig. 4c, with the higher value (\({V}_{T}\) = 1.8), their opacity parameters went down toward the minimum (see the first tent peak). When \({V}_{T}\) was increased to 2.0 (see Fig. 4d), they became almost transparent through the assignment of lower opacity values, and additionally the opacity parameters of the ROI (see the third tent peak) were made higher to put the most emphasis on the ROI.
In Fig. 5, we show how our MR-TF approach optimizes three different ROIs in an atomic nucleus simulation volume. The initial TF was set to semi-transparently depict the nucleus, innermost (K) shell, and intermediate (L) shell, and then optimized to match \({V}_{T}\) to 2.0 for the ROIs. In Fig. 5a, our MR-TF approach made the user-specified ROI of the nucleus visually apparent (see the fourth tent peak in the optimized TF) while keeping the other regions the same (see the first to third tent peaks). Our MR-TF approach resulted in consistent optimizations for the other two ROIs, i.e., the innermost shell ROI in Fig. 5b and the intermediate shell ROI in Fig. 5c became the visually dominant regions.
In Fig. 6, we conduct an experiment to see how our MR-TF approach optimizes the same ROI according to the three viewpoint variations. An initial TF visualized the skin, lungs, muscles, kidneys, aorta, and bones of the abdominal CT volume. Here, the user-defined aorta ROI was obscured by different regions depending on the viewpoints. In the front view with \({V}_{T}\) = 1.1 (Fig. 6a), our MR-TF approach identified and de-emphasized the strong occluder of the skin and muscles (see the first to third tent peaks in the optimized TF) while highlighting the ROI (see the fourth tent peak). A new occluder of kidneys was introduced in both side view with \({V}_{T}\) = 1.4 (Fig. 6b) and the back view with \({V}_{T}\) = 1.7 (Fig. 6c). Consistent with the front view, our MR-TF approach succeeded in attenuating all occluders to ensure the visibility of the ROI.
In Fig. 7, we show how our MR-TF approach converges from the three different initial TFs. A human foot CT volume was used, and the initial TFs varied in the visualization of the soft tissues and muscles as well as the toe bone ROI. The same value of \({V}_{T}\) was applied to all initial TFs. In Fig. 7a where all regions were given the same opacity values, our MR-TF approach optimized the initial TF toward highlighting the ROI (see the third peak in the optimized TF) while attenuating the other occluding regions (see the first and second peaks). Even for the other two initial TFs that put more emphasis on either the muscles (see Fig. 7b) or the soft tissues (see Fig. 7c), our MR-TF approach tended to consistently optimize the toe bone ROI.
We show how our MR-TF approach optimizes the same ROI according to the color contrast to the physical backgrounds of semi-transparent MR-HMDs in Fig. 8. For the experiment, we used three synthetic images with unique colors as the physical background variations. An initial TF was defined to depict the vessels, crown, and dentin in a tooth CT volume, and the DVRs were intermixed with the three different background images, where the visual clarity of the reddish vessel ROI varied. The same value of \({V}_{T}\) was applied to all background images. With the blue background image (Fig. 8a) where the color difference in HSV space to the vessel ROI was substantial, our MR-TF approach ensured \({V}_{T}\) = 1.6 for the ROI by moderately increasing its opacity parameter (see the first tent peak in the optimized TF). A greater increase in its opacity parameter had to be made in the green background image (Fig. 8b) which exhibited less color difference from the vessel ROI, in order to satisfy the same \({V}_{T}\). In Fig. 8c, the dark-red background was the most similar to the color of the ROI, which resulted in not only the increase in its opacity parameter but also substantial decreases in the other two opacity parameters (the crown and dentin regions) to achieve its consistent visual emphasis.
In Fig. 9, we experimented with our MR-TF approach using two real-world backgrounds that exhibit varying textures or complex physical objects. We compared TF optimization with and without the application of our color contrast-based weight function. The same \({V}_{T}\) value of 1.6 was applied to all TF optimizations. In the case of the tooth volume, the application of our color contrast-based weight function made the red vessel ROI more prominent (see Fig. 9b vs c). This was achieved by greatly increasing the opacity parameters of the vessel ROI to ensure its \({V}_{T}\) against the physical background of a similar color. A consistent TF optimization result was shown for the engine volume, where the yellow cylinder ROI and the surrounding blue exterior chassis were intermixed with the green leaves in a flower pot. With our color contrast-based weight function (see Fig. 9b), the opacity parameter of the surrounding exterior chassis was decreased much more drastically. This adjustment was not achievable without our color contrast-based weight function (see Fig. 9c).
In Fig. 10, we show how our approach can be applied to multi-modal volumes. For the PET (positron emission tomography)-CT volumes of the human abdomen (Fig. 10a), there was an initial TF pair: a CT TF to depict the skin, lungs, muscles, and bones and a PET TF to emphasize the abdominal lymphoma as an ROI. Our MR-TF approach optimized the multi-modal DVR in a manner that processed the CT TF while keeping the PET counterpart unchanged. Initially, the lymphoma ROI in the PET volume was faintly visible. Our approach identified and de-emphasized the occluding regions of the skin, lungs, and muscles (the first to third tent peaks) in the CT volume to ensure \({V}_{T}\) = 1.2 for the PET ROI. Another volume pair, of MRI (magnetic resonance imaging) and PET for the human brain, was applied in Fig. 10b. An initial MRI TF was set to visualize the skin, cerebrum, and corpus callosum, and the PET TF counterpart was set to emphasize a brain tumor as an ROI. Our approach optimized the MRI TF toward ensuring \({V}_{T}\) = 1.1 for the brain tumor ROI of the PET volume. After our optimization, the brain tumor ROI was made visually apparent by attenuating the occluding skin and cerebrum of the MRI volume (see the first and second tent peaks).
In Table 1, we show the computational complexity (time) results of our MR-TF approach in selecting (segmenting) ROIs and optimizing TFs. We observed that our MR-TF approach could complete in 14 s on average prior to code optimization. As expected, the computational complexity of our TF parameter optimization was much greater than that of our ROI segmentation. The computational complexity of our TF parameter optimization was generally proportional to the resolutions of the volumes. However, it did not scale directly with the number of volumes (see Paranomix, a single-modality volume, vs Petcetix, a multi-modality volume pair). Instead, the spatial relations among ROIs and their neighborhoods may have a greater impact on the computational complexity. We suggest that our MR-TF approach is operable in an online mode, provided that the physical memory of MR-HMDs is sufficient to load the target volumes.
4.3 SUS scores and informal feedback from the user study
The SUS score comparison results between our semi-automated MR-TF approach and a manual counterpart (Choi et al. 2024) are shown in Fig. 11. Our MR-TF approach achieved an average SUS score of 83.2, corresponding to the second-best grade of “A” on the scale proposed by Lewis and Sauro (2018). In contrast, the manual approach obtained a notably lower SUS score of 54.2, which corresponds to the second-worst grade of “D”. Most of the participants reported an initial learning curve when interacting with our MR-TF approach; however, they found the learning process considerably more straightforward and intuitive when compared to the manual counterpart. The ability of our MR-TF to directly select ROIs using the built-in biometric interfaces and to automatically optimize TFs based on the user-defined hyperparameter was highly valued for its usability. In contrast, the manual counterpart inherently involved a task of non-intuitive and iterative fine-grained adjustment in a TF parameter space (a widget), which was found to be more difficult to perform in the MR-HMD environment. Additionally, some participants noted that the inclusion of voice commands enabled hands-free operation and improved accessibility.
The box plot diagram of the SUS score comparison results between our semi-automated MR-TF approach and a manual counterpart (Choi et al. 2024)
5 Discussion and future works
Our results demonstrated the capabilities of our MR-TF approach to enable (i) the intuitive and 3D-recognizable user selection of ROIs in MR-HMDs (see Fig. 5); (ii) the automated visual optimization of ROIs according to the user-defined visibility (see Fig. 4); (iii) the consistent color perception of ROIs regardless of the physical backgrounds to be intermixed (see Fig. 8); and (iv) on-device implementation on the commercial MR-HMD, Microsoft HoloLens 2, for interactive ROI-based DVR (see Table 1). Our results with different types of volumes and multi-modal datasets (see Fig. 10) suggested that our approach achieved a certain level of adaptability for its use in various DVR scenarios of MR-HMDs. The findings from the user study, furthermore, confirmed the high level of usability of our MR-TF approach (see Fig. 11).
We performed the user selection of ROIs in MR-HMDs by utilizing the built-in biometric interfaces. The precise selection of ROIs within initial DVRs is not a trivial task because ROIs may be fuzzy, multi-layered, and occluded by other regions of a volume (see Figs. 5 and 10). In this circumstance, a user is required to accurately map the intended ROIs to the input interaction, in our case, point selection. We enabled this mapping through the use of eye-gazing and hand gestures. These interactions were visual, stereoscopic, and 3D-recognizable, thus allowing a user to identify ROIs in an intuitive and precise manner.
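The point-selection mapping above can be illustrated with a minimal gaze-ray marching sketch. This is not our on-device implementation; the `sample` callback, fixed step size, and intensity threshold are our own illustrative assumptions.

```python
def pick_voxel(origin, direction, sample, threshold, step=0.5, max_t=64.0):
    """March along a (unit) gaze ray from `origin` and return the first
    grid position whose sampled intensity reaches `threshold`, i.e., the
    voxel a user's point selection would land on; None if nothing is hit.
    `sample(x, y, z)` is an assumed volume-sampling callback."""
    t = 0.0
    while t < max_t:
        x = origin[0] + t * direction[0]
        y = origin[1] + t * direction[1]
        z = origin[2] + t * direction[2]
        if sample(x, y, z) >= threshold:
            return (round(x), round(y), round(z))
        t += step
    return None
```

In practice, the ray origin and direction would come from the HMD's eye-tracking interface, and the hit voxel would seed the subsequent ROI segmentation.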
We chose to adopt TF parameter optimization algorithms (Correa and Ma 2009, 2010; Jung et al. 2013) for the automated visual emphasis of ROIs in MR-HMDs, although other types of algorithms exist in the DVR and TF research community (Kindlmann 2002; Wu and Qu 2007; Lundstrom et al. 2006). This choice reflected our observation that MR-HMDs rely on intuitive but simple biometric interfaces, and thus we should steer clear of sophisticated user interactions. Additionally, for practical DVR usage scenarios such as medical volume visualization (Preim and Bartz 2007; Zhang et al. 2011), carefully pre-defined static TFs exist, but they cannot be applied to all scenarios, and additional optimization must be involved to some extent for new volumes and/or visualization needs. Our work aimed at meeting such requirements. Our results suggest that our MR-TF approach, i.e., the automated fine-tuning of a given initial TF, could be a viable TF design solution for MR-HMDs (see Fig. 7).
Our MR-TF approach addressed an important color distortion issue in MR-HMDs by quantifying the color distortion and integrating it into the TF parameter optimization process. DVRs are projected on the semi-transparent display and intermixed with the physical backgrounds. Users are thus inevitably subject to the visual distortion of ROIs according to the color variations of the physical backgrounds. This issue has so far been little investigated in the TF research community. Our MR-TF approach quantified the color contrast between the ROIs and physical backgrounds in HSV space, which properly reflects the human visual system, and applied the resultant weights to the TF parameters of the ROIs. This allowed optimized ROIs to maintain consistent color perception while improving their distinction from the physical backgrounds. Our results suggest that our MR-TF approach could be an automated and effective solution to this issue (see Figs. 8 and 9).
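One plausible form of such a color contrast-based weight is sketched below: the HSV distance between an ROI color and a sampled background color is mapped to a weight that grows as the contrast shrinks, so that low-contrast backgrounds demand larger opacity adjustments. The inverse-distance mapping, hue scaling, and `eps` floor are our own assumptions for illustration, not the exact function used in our implementation.

```python
import colorsys

def hsv_contrast_weight(roi_rgb, bg_rgb, eps=1e-3):
    """Illustrative weight: smaller HSV distance (lower contrast)
    yields a larger weight on the ROI opacity parameter."""
    h1, s1, v1 = colorsys.rgb_to_hsv(*roi_rgb)
    h2, s2, v2 = colorsys.rgb_to_hsv(*bg_rgb)
    # Hue is circular: wrap the difference into [0, 0.5]
    dh = abs(h1 - h2)
    dh = min(dh, 1.0 - dh)
    # Euclidean distance in (hue-scaled) HSV space
    dist = ((dh * 2.0) ** 2 + (s1 - s2) ** 2 + (v1 - v2) ** 2) ** 0.5
    return 1.0 / (dist + eps)
```

With a red ROI, this sketch assigns a larger weight against a dark-red background than against a blue one, matching the behavior observed in Fig. 8.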
Concerns frequently encountered when adopting parameter optimization algorithms lie with the sensitivity of hyperparameters and their impact on optimization results. In our TF parameter optimization, the hyperparameters included the initial TFs, the type of ROIs, and their target visibility, as well as the viewing directions. Our results showed that our MR-TF approach achieved a certain level of robustness to variations in these hyperparameters and ensured reliable ROI-based DVRs (see Figs. 4, 5, 6, 7). We believe that such robustness was, in part, attributable to the use of the visibility metric (Correa and Ma 2009) as the loss function of our TF parameter optimization. The hyperparameters are closely related to the spatial relationship between ROIs and the other regions of a volume. The visibility metric could compute this spatial relationship and thus robustly optimize an initial TF to emphasize ROIs while attenuating other occluding regions.
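The visibility metric of Correa and Ma (2009) defines a sample's visibility as its opacity modulated by the transmittance of the material in front of it along the viewing ray. A simplified single-ray sketch (our own reduction of the metric for illustration) is:

```python
def ray_visibility(alphas, roi_mask):
    """Accumulated ROI visibility along one ray, front to back.
    visibility_i = alpha_i * transmittance of all samples in front of i.
    `alphas` are per-sample opacities; `roi_mask` flags ROI samples."""
    transmittance = 1.0
    roi_vis = 0.0
    for alpha, in_roi in zip(alphas, roi_mask):
        if in_roi:
            roi_vis += alpha * transmittance
        transmittance *= (1.0 - alpha)
    return roi_vis
```

Lowering the opacity of an occluder in front of the ROI directly raises the ROI's visibility, which is precisely the behavior the optimization exploits when attenuating occluding regions.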
Our MR-TF approach limited the TF parameters to be optimized to opacity (i.e., the tent peaks in the TF parameter space). We decided not to include the other parameter type, intensity (i.e., the bottom sides of the tents). Instead, we initialized the intensity parameters using TF presets and kept them unchanged during the TF parameter optimization process. Our decision was based on the findings that the opacity parameter type holds a far more direct relevance when optimizing the visibility of ROIs, which is the aim of our MR-TF approach, and that, traditionally, it has been regarded as the major burden in manual TF optimization (Correa and Ma 2010; Wang et al. 2011; Jung et al. 2013). Another practical reason for the exclusion stemmed from our application to MR-HMDs with low computing capability: increasing the number of TF parameters led to a substantial rise in computation time. Our preliminary experiments showed that including both parameter types (i.e., twice the number of TF parameters) approximately doubled the computation cost. As such, although our MR-TF approach is capable of processing all TF parameter types, our current experimental setting used the opacity parameter type only. If new MR-HMDs with greater computing power become available on the market, or if we choose to rely on remote servers for offloading the heavy computation parts of our MR-TF approach, we would integrate both types of TF parameters.
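A minimal sketch of the tent-based 1D TF described above, in which only the peak opacities are exposed as optimizable parameters while the intensity positions and widths stay fixed (the exact parameterization in our implementation may differ):

```python
def tent_tf(intensity, tents):
    """Evaluate a 1D tent-based transfer function at a normalized intensity.
    `tents` is a list of (center, half_width, peak_opacity) triples;
    in an opacity-only optimization, just the peak_opacity values vary,
    mirroring an intensity initialization from TF presets."""
    opacity = 0.0
    for center, half_width, peak in tents:
        d = abs(intensity - center)
        if d < half_width:
            # Linear ramp from the peak at the center down to 0 at the edges
            opacity = max(opacity, peak * (1.0 - d / half_width))
    return opacity
```

An optimizer then treats the vector of peak opacities as its search space and evaluates the resulting ROI visibility per candidate.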
Our MR-TF approach was designed to optimize voxels (ROIs) in the original volume space to ensure the homogeneous visibility of ROIs when intermixed with physical backgrounds. In contrast, a commonly used approach for MR-HMDs optimizes the visibility of projected ROIs (Fukiage et al. 2014; Weiland et al. 2009; Sridharan et al. 2013) in a 2D image result. This approach is mainly effective in instances such as natural images, where an opaque ROI is projected and there is no overlap with other regions. DVR, however, is an algorithm that visualizes volumes by accumulating semi-transparent ROIs and other regions along a viewing direction. Consequently, a single pixel in a 2D visualization result may exhibit heterogeneity across different region types; e.g., in Fig. 8, the yellow sphere contains the vessel ROI, crown, and dentin. In such circumstances, the conventional approach inherently computes the visibility of multiple projected regions collectively against physical backgrounds, which can compromise precise optimization and potentially result in undesirable visibility outcomes for ROIs. Instead, our MR-TF approach is capable of measuring the influence of physical backgrounds on individual regions, thus allowing for enhanced separation (i.e., homogeneous optimization) of ROIs in terms of visibility.
We positioned our MR-TF approach as a framework to enable interactive ROI-based DVRs in MR-HMDs. Thus, we validated our MR-TF approach under generic settings that are commonly used in a wide range of DVR applications and have a small number of variables impacting DVR experiments. We note that our MR-TF approach is not bound by the current settings and can adopt new advances to overcome their inherent limitations. For instance, we employed established intensity-based 1D TFs as the parameter spaces for optimization in our MR-TF approach. As expected, our MR-TF approach was subject to the known limitation of 1D TFs, namely that it is challenging to differentiate ROIs from regions that overlap in intensity ranges (e.g., the internal and external shells in Fig. 5). The adoption of multi-dimensional TFs would allow for more elaborate TF parameter optimizations. Similarly, we utilized the region growing algorithm (Monga 1987) to select and segment ROIs. This segmentation algorithm is versatile, computationally efficient, and thus appropriate for MR-HMDs. Although effective in our experiments, it may not be the best performer for all cases. We could adopt segmentation algorithms that are disease-/modality-specific, as well as recent advances in deep neural networks, for enhanced segmentation accuracy. Another component we can improve is the method for measuring the color contrast between the ROIs and physical backgrounds. We observed a multitude of advances in the research community focusing on human visual perception, and they can be readily integrated into our MR-TF approach. We consider the adoption of these new advances an important extension of our current work. Future investigations will focus on strategies to mitigate the additional computational burden of their implementation in low-spec MR-HMDs.
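For reference, seeded region growing of the kind we adopted can be sketched as follows; the 6-connectivity, fixed intensity tolerance, and dictionary-based volume representation are illustrative simplifications rather than the exact algorithm of Monga (1987).

```python
from collections import deque

def region_grow(volume, seed, tol):
    """Minimal 6-connected region growing from a seed voxel.
    `volume` maps (x, y, z) -> intensity; the region grows while a
    neighbor's intensity stays within `tol` of the seed intensity."""
    seed_val = volume[seed]
    region, queue = {seed}, deque([seed])
    while queue:
        x, y, z = queue.popleft()
        for dx, dy, dz in ((1,0,0), (-1,0,0), (0,1,0),
                           (0,-1,0), (0,0,1), (0,0,-1)):
            n = (x + dx, y + dy, z + dz)
            if n in volume and n not in region and abs(volume[n] - seed_val) <= tol:
                region.add(n)
                queue.append(n)
    return region
```

The seed voxel would come from the gaze-based point selection, and the grown region then serves as the ROI for TF parameter optimization.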
In real-world MR-HMD applications, the user is typically in motion to some extent, which leads to changes in the physical backgrounds. Consequently, our MR-TF approach may require continuous re-execution to account for these dynamic changes. Our MR-TF approach was designed with the assumption of a one-off execution in constrained scenarios. This was based on the premise that unless the user's movement is substantial, the physical background would remain largely unchanged, and re-execution would not be required. Otherwise, the user would need to wait an average of 14 s for each movement (although this was measured prior to code optimization, as in Table 1), which may lead to an unsatisfactory user experience. As an interesting future work, we aim to extend the applicability of our MR-TF approach by addressing a wider range of scenarios involving drastic user movements. We plan to implement a mechanism that detects visual changes in the physical background and triggers our MR-TF approach dynamically when such changes are noticeable. Additionally, we can integrate foveated rendering techniques (Mohanto et al. 2022) into our MR-TF approach. With foveated rendering, we can improve the execution performance itself by prioritizing high-quality visualization in the regions where the user is gazing (e.g., ROIs) while reducing details in peripheral areas. Another interesting future work is to assess and compare our MR-TF approach with different types of MR-HMD displays. We experimented with the optical see-through display of Microsoft HoloLens 2. Here, the user sees the physical background directly through the display optics, thus providing more natural but less controllable intermixing with a virtual element (i.e., a visualization). An alternative is video see-through displays, such as those used in Meta Quest 3 and Apple Vision Pro, which allow for more controlled intermixing of virtual and physical elements.
It would be valuable to investigate how the visual perception of optimized ROIs differs between the two types of MR-HMD displays. Furthermore, image-based interactions (Guo et al. 2011) could complement our framework. Similar to operations in 2D painting applications, such as Adobe Photoshop, a wide range of interactions, including erasing regions, adding silhouettes, peeling region layers, and modifying contrast and brightness, can be directly applied to DVR images. These DVR interactions align well with the biometric interfaces of MR-HMDs, and we believe that they would make our MR-TF approach more intuitive, goal-oriented, and usable.
6 Conclusions
This work allowed for interactive ROI-based DVR in MR-HMDs by introducing a new MR-TF approach and utilizing the built-in biometric interfaces. Our results suggest that intended ROIs could be intuitively user-selected and automatically optimized to ensure the visual emphasis of ROIs and their consistent color perception regardless of the physical backgrounds to be intermixed in MR-HMDs. Extensive experimentations with various volume datasets on the commercial MR-HMD, Microsoft HoloLens 2, validated the utility and practicality of our MR-TF approach. We envision that these overall combined capabilities will contribute substantially to the widespread integration of DVR in a diverse range of commercial, medical, and scientific MR-HMD applications.
Data availability
No datasets were generated or analysed during the current study.
References
Ajmal A, Hollitt C, Frean M, Al-Sahaf H (2018) A comparison of RGB and HSV colour spaces for visual attention models. In: 2018 international conference on image and vision computing New Zealand (IVCNZ)
Allison B, Ye X, Janan F (2020) Mixr: a standard architecture for medical image analysis in augmented and mixed reality. In: 2020 IEEE international conference on artificial intelligence and virtual reality (AIVR)
An H, Kim J, Sheng B, Li P, Jung Y (2023) A transfer function optimization using visual saliency for region of interest-based direct volume rendering. Displays 80:102531
Bordoloi UD, Shen H-W (2005) View selection for volume rendering. In: IEEE visualization 2005 (VIS 05)
Brooke J (1996) SUS-A quick and dirty usability scale. Usability Eval Indus 189(194):4–7
Caban JJ, Rheingans P (2008) Texture-based transfer functions for direct volume rendering. IEEE Trans Visual Comput Graphics 14(6):1364–1371
Cai W, Chen T, Shi J (1995) Rendering of surface and volume details in volume data. Comput Graph Forum 14:421
Cheng H, Xu C, Chen X, Chen Z, Wang J, Zhao L (2023) Realistic volume rendering with environment-synced illumination in mixed reality. In: 2023 IEEE international symposium on mixed and augmented reality adjunct (ISMAR-Adjunct)
Choi J, Kim H, An H, Jung Y (2024) Real-time transfer function editor for direct volume rendering in mixed reality. In: SIGGRAPH Asia 2024 posters, pp 1–2
Correa CD, Ma K-L (2009) Visibility-driven transfer functions. In: 2009 IEEE pacific visualization symposium
Correa C, Ma K-L (2008) Size-based transfer functions: a new volume exploration technique. IEEE Trans Visual Comput Graphics 14(6):1380–1387
Correa CD, Ma K-L (2010) Visibility histograms and visibility-driven transfer functions. IEEE Trans Visual Comput Graphics 17(2):192–204
Fuchs R, Hauser H (2009) Visualization of multi‐variate scientific data. Comput Graph Forum
Fukiage T, Oishi T, Ikeuchi K (2014) Visibility-based blending for real-time applications. In: 2014 IEEE international symposium on mixed and augmented reality (ISMAR)
Gao Y, Chang C, Yu X, Pang P, Xiong N, Huang C (2022) A VR-based volumetric medical image segmentation and visualization system with natural human interaction. Virtual Reality 26(2):415–424
Guo H, Mao N, Yuan X (2011) Wysiwyg (what you see is what you get) volume visualization. IEEE Trans Visual Comput Graphics 17(12):2106–2114
Heyd J, Birmanns S (2009) Immersive structural biology: a new approach to hybrid modeling of macromolecular assemblies. Virtual Reality 13:245–255
Hincapié-Ramos JD, Ivanchuk L, Sridharan SK, Irani P (2014) SmartColor: real-time color correction and contrast for optical see-through head-mounted displays. In: 2014 IEEE international symposium on mixed and augmented reality (ISMAR)
Hrycak C, Lewakis D, Krüger J (2024) Investigating the apple vision pro spatial computing platform for GPU-based volume visualization. In: 2024 IEEE visualization and visual analytics (VIS)
Itti L, Koch C, Niebur E (1998) A model of saliency-based visual attention for rapid scene analysis. IEEE Trans Pattern Anal Mach Intell 20(11):1254–1259
Joo SJ, Lee JH (2021) Three-dimensional depth perception in augmented reality. Korean J Cognit Biol Psychol 33(3):121–131
Jung Y, Kim J, Eberl S, Fulham M, Feng DD (2013) Visibility-driven PET-CT visualisation with region of interest (ROI) segmentation. Vis Comput 29:805–815
Jung Y, Kim J, Kumar A, Feng DD, Fulham M (2016) Efficient visibility-driven medical image visualisation via adaptive binned visibility histogram. Comput Med Imaging Graph 51:40–49
Jung Y, Kim J, Kumar A, Feng DD, Fulham M (2018) Feature of interest‐based direct volume rendering using contextual saliency‐driven ray profile analysis. Comput Graph Forum
Jung H, Jung Y, Kim J (2022) Understanding the capabilities of the hololens 1 and 2 in a mixed reality environment for direct volume rendering with a ray-casting algorithm. In: 2022 IEEE conference on virtual reality and 3D user interfaces abstracts and workshops (VRW)
Justice RK, Stokely EM, Strobel JS, Ideker RE, Smith WM (1997) Medical image segmentation using 3D seeded region growing. In: Medical imaging 1997: image processing
Kaufman AE, Yagel R, Hoehne KH, Pommert A (1994) Volume visualization. Vis Biomed Comput
Kindlmann G, Durkin JW (1998) Semi-automatic generation of transfer functions for direct volume rendering. In: Proceedings of the 1998 IEEE symposium on volume visualization
Kindlmann G (2002) Transfer functions in direct volume rendering: design, interface, interaction. Cour Notes ACM SIGGRAPH 3
Klacansky (2017) Open scientific visualization datasets. Retrieved Aug 5 2024 from https://klacansky.com/open-scivis-datasets/
Lagarias JC, Reeds JA, Wright MH, Wright PE (1998) Convergence properties of the Nelder–Mead simplex method in low dimensions. SIAM J Optim 9(1):112–147
Lee S, Jung H, Lee E, Jung Y, Kim ST (2021) A preliminary work: mixed reality-integrated computer-aided surgical navigation system for paranasal sinus surgery using Microsoft HoloLens 2. In: Computer graphics international conference
Leuze C, Yang G, Hargreaves B, Daniel B, McNab JA (2018) Mixed-reality guidance for brain stimulation treatment of depression. In: 2018 IEEE international symposium on mixed and augmented reality adjunct (ISMAR-Adjunct)
Lewis JR (2018) The system usability scale: past, present, and future. Int J Hum Comput Interact 34(7):577–590
Lewis JR, Sauro J (2018) Item benchmarks for the system usability scale. J Usability Stud 13(3)
Ljung P, Krüger J, Groller E, Hadwiger M, Hansen CD, Ynnerman A (2016) State of the art in transfer functions for direct volume rendering. Comput Graph Forum
Lundstrom C, Ljung P, Ynnerman A (2006) Local histograms for design of transfer functions in direct volume rendering. IEEE Trans Visual Comput Graph 12(6):1570–1579
Ma B, Entezari A (2017) Volumetric feature-based classification and visibility analysis for transfer function design. IEEE Trans Visual Comput Graphics 24(12):3253–3267
Macedo MC, Apolinário AL, Souza AC, Giraldi GA (2014) A semi-automatic markerless augmented reality approach for on-patient volumetric medical data visualization. In: 2014 XVI symposium on virtual and augmented reality
Manousopoulos P, Michalopoulos M (2009) Comparison of non-linear optimization algorithms for yield curve estimation. Eur J Oper Res 192(2):594–602
Max N (1995) Optical models for direct volume rendering. IEEE Trans Visual Comput Graph 1(2):99–108
Meyer-Spradow J, Ropinski T, Mensmann J, Hinrichs K (2009) Voreen: a rapid-prototyping environment for ray-casting-based volume visualizations. IEEE Comput Graphics Appl 29(6):6–13
Microsoft (2022) MixedRealityToolkit-Unity. Retrieved Aug 5 2024 from https://github.com/microsoft/MixedRealityToolkit-Unity
Mirbagheri M, Chau T (2024) Optimising virtual object position for efficient eye-gaze interaction in Hololens2. Comput Methods Biomech Biomed Eng Imaging vis 12(1):2337765
Mohanto B, Islam AT, Gobbetti E, Staadt O (2022) An integrative view of foveated rendering. Comput Graph 102:474–501
Monga O (1987) An optimal region growing algorithm for image segmentation. Int J Pattern Recognit Artif Intell 1(03n04):351–375
Nelder JA, Mead R (1965) A simplex method for function minimization. Comput J 7(4):308–313
Pixmeo (2024) Osirix DICOM image library. Retrieved Aug 5 2024 from https://www.osirix-viewer.com/resources/dicom-image-library/
Pooryousef V, Cordeil M, Besançon L, Hurter C, Dwyer T, Bassed R (2023) Working with forensic practitioners to understand the opportunities and challenges for mixed-reality digital autopsy. In: Proceedings of the 2023 CHI conference on human factors in computing systems
Preim B, Bartz D (2007) Visualization in medicine: theory, algorithms, and applications. Elsevier
Ropinski T, Praßni J-S, Steinicke F, Hinrichs KH (2008) Stroke-based transfer function design. VG/PBG@ SIGGRAPH
Ruiz M, Bardera A, Boada I, Viola I, Feixas M, Sbert M (2011) Automatic transfer functions based on informational divergence. IEEE Trans Visual Comput Graphics 17(12):1932–1941
Sridharan SK, Hincapié-Ramos JD, Flatla DR, Irani P (2013) Color correction for optical see-through displays using display color profiles. In: Proceedings of the 19th ACM symposium on virtual reality software and technology
Wang P, Bai X, Billinghurst M, Zhang S, Zhang X, Wang S, He W, Yan Y, Ji H (2021) AR/MR remote collaboration on physical tasks: a review. Robot Comput Integr Manuf 72:102071
Wang Y, Zhang J, Chen W, Zhang H, Chi X (2011) Efficient opacity specification based on feature visibilities in direct volume rendering. Comput Graph Forum
Weiland C, Braun A-K, Heiden W (2009) Colorimetric and photometric compensation for optical see-through displays. In: Universal access in human-computer interaction. Intelligent and ubiquitous interaction environments: 5th international conference, UAHCI 2009, held as part of HCI International 2009, San Diego, CA, USA, July 19–24, 2009, proceedings, part II
Wieczorek M, Aichert A, Kutter O, Bichlmeier C, Landes J, Heining SM, Euler E, Navab N (2010) GPU-accelerated rendering for medical augmented reality in minimally-invasive procedures. Bildverarbeitung Für Die Medizin 574:102–106
Wong KC, Sun EY, Wong IOL, Kumta SM (2023) Mixed reality improves 3D visualization and spatial awareness of bone tumors for surgical planning in orthopaedic oncology: a proof of concept study. Orthopedic Res Rev 139–149
Wu Y, Qu H (2007) Interactive transfer function design based on editing direct volume rendered images. IEEE Trans Visual Comput Graphics 13(5):1027–1040
Zhang Q, Eagleson R, Peters TM (2011) Volume visualization: a technical overview with a focus on medical applications. J Digit Imaging 24:640–664
Zhang Y, Wang R, Peng Y, Hua W, Bao H (2021) Color contrast enhanced rendering for optical see-through head-mounted displays. IEEE Trans Visual Comput Graphics 28(12):4490–4502
Zhang M, Liu W, Weibel N, Schulze JP (2022) A directx-based dicom viewer for multi-user surgical planning in augmented reality. Int Sympos Vis Comput
Acknowledgements
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (RS-2025-00554526), the Culture, Sports and Tourism R&D Program through the Korea Creative Content Agency grant funded by the Ministry of Culture, Sports and Tourism in 2023 (Project Name: Cultural Technology Specialist Training and Project for Metaverse Game, Project Number: RS-2023-00227648), and the Gachon University research fund of 2021 (GCU-202110030001).
Author information
Authors and Affiliations
Contributions
Conceptualization: MJ, YH; Develop and Experimentation: MJ; Data analysis: MJ, SH, YH; Writing: MJ, YH; Editing: SH, YH; All authors reviewed the manuscript. We confirm that the order of authors has been approved by all named authors.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Below is the link to the electronic supplementary material.
Supplementary file 1 (MP4 279568 KB)
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Song, M., Kim, S. & Jung, Y. An intuitive and semi-automated transfer function design for interactive region of interest-based direct volume rendering in mixed reality head mounted devices. Virtual Reality 29, 53 (2025). https://doi.org/10.1007/s10055-025-01121-4