Hi,
I'm currently working on an ARKit project where I need to implement object occlusion on devices that do not have a LiDAR sensor (e.g., iPhone XR, iPhone 11).
I used CoreML models like DepthAnythingV2 to create depth maps and DETRResnet50SemanticSegmentationF16P8 to to perform real-time segmentation. But these models are too heavy for devices.
Much appreciated on any advice or pointers to resources.
ARKit
RSS for tagIntegrate iOS device camera and motion features to produce augmented reality experiences in your app or game using ARKit.
Posts under ARKit tag
200 Posts
Sort by:
Post
Replies
Boosts
Views
Activity
I am able to play a video on a plane in ARKit. But, I need to remove its background, presumably using a chromaKey technique. how do I do that? I have spent more time than is reasonable searching the docs and the internet, but to no avail. ChatGPT useless of course.
Surely it is doable; even the react native library for ARKit has it.
I'm seeking detailed information about the rotation matrix of the iPhone's front-facing (selfie) camera when using ARKit.
Specifically, I need to understand:
The exact rotation matrix applied to the front-facing camera's output in ARKit.
Whether this matrix is consistent across all iPhone models or if there are variations.
If there are any transformations applied to align the camera's coordinate system with the device's orientation, particularly in portrait mode.
How this rotation matrix relates to the transform property of `ARFrame.camera
Hi, I’m working on a portfolio project for Vision Pro these days.
I have two projects and each of projects are made with Unity and made with Xcode(using ARKit and RealityKit tracking feature). Is it able to combine these two projects in an app?
For example, using the buttons made with SwiftUI in a Reality Composer Pro, jumping to a scene in Unity, and back from a scene in Unity to a scene in Reality Composer Pro in an app.
if you check the code here,
https://developer.apple.com/documentation/compositorservices/interacting-with-virtual-content-blended-with-passthrough
var body: some Scene {
ImmersiveSpace(id: Self.id) {
CompositorLayer(configuration: ContentStageConfiguration()) { layerRenderer in
let pathCollection: PathCollection
do {
pathCollection = try PathCollection(layerRenderer: layerRenderer)
} catch {
fatalError("Failed to create path collection \(error)")
}
let tintRenderer: TintRenderer
do {
tintRenderer = try TintRenderer(layerRenderer: layerRenderer)
} catch {
fatalError("Failed to create tint renderer \(error)")
}
Task(priority: .high) { @RendererActor in
Task { @MainActor in
appModel.pathCollection = pathCollection
appModel.tintRenderer = tintRenderer
}
let renderer = try await Renderer(layerRenderer,
appModel,
pathCollection,
tintRenderer)
try await renderer.renderLoop()
Task { @MainActor in
appModel.pathCollection = nil
appModel.tintRenderer = nil
}
}
layerRenderer.onSpatialEvent = {
pathCollection.addEvents(eventCollection: $0)
}
}
}
.immersionStyle(selection: .constant(appModel.immersionStyle), in: .mixed, .full)
.upperLimbVisibility(appModel.upperLimbVisibility)
the only way it's dealing with the error is fatalError.
And don't think I can throw anything or return anything else?
Is there a way I can gracefully handle this and show a message box in UI?
I was hoping I could somehow trigger a failure and have https://developer.apple.com/documentation/swiftui/openimmersivespaceaction return fail.
but couldn't find a nice way to do so.
Let me know if you have ideas.
Hi everyone,
I'm looking for some guidance on how to create an .arobject file. Is there a way to generate it from a 3D model (like a .objcap or .usdz or .fbx/.obj file)? Or can it only be generated by scanning a real-world object using the ARKitScanner project? Any advice or resources on this would be greatly appreciated!
I've only found this project for scanning real-world objects.
Thanks in advance!
Hello Apple Team,
Is it possible to change the zoom factor, exposure, white balance and other settings, of an iOS ARKit session?
I know how to do it using an AVCaptureSession,
however, I can't figure out how to access the AVCaptureDeviceInput of the current AR session.
Thanks
PS: I'm using ARkit and RealityKit on iOS 17
Hi All,
Is it possible to record a video using the Object Capture instead of taking a series of pictures ?
Is it possible to get the bounding box coordinates of the object we capture ?
Hi everyone,
I’m working on an app for VisionOS that needs to recognize individual rooms in a hallway based on the person the room belongs to (using the name displayed on each office door). Is there any sample code or resource that can guide me in implementing this feature?
Thanks in advance for your help!
I have created a portal and attached it to a wall using the AnchorEntity. However, I am seeking guidance on how to determine the size of the wall so that the portal can fully occupy it. Initially, I attempted to locate relevant information within the demo code, but I encountered difficulties in comprehending certain sections. I would appreciate it if someone could provide a step-by-step explanation or a reference to the appropriate code. Thank you for your assistance.
In the visionOS App, I want to detect whether the user is in the room. My idea is as follows:
Check whether there are walls around the user
May I ask how to do it? Thanks!
We have a native iOS app that supports the upload and display of USDZ files. It has been working great since beta (late 2022) and live launch (late 2023) until now. But recently we had reports from some users on Max model phones (14 + 15 Pro Max) at least. When they launch tap and launch a 3D file the Quick Look player is triggered. So far so good. But for affected users the controls along the top of the player - X (close) AR | Object (toggle) and share button - are moving too high up the phone screen and getting stuck (untappable) behind the phone's top status bar (time, camera bug, connection, battery).
This means that when they open a USDZ file in AR or 3D view they have to hard-close the app to get out of it again. This doesn't happen when they open a USDZ file from Files, Dropbox etc on their phone (which also uses the Quick Look player). The controls only move up and get stuck when launching a USDZ from within our app.
I'm at a loss to figure out what might be causing this on some phones and not all others. And why only when opening a USDZ file from our app! So far we have replicated this issue on a single iphone 14 Pro Max and a 15Pro Max, both running iOS18+
We have tested on other 15Pro Max's on same OS, and Pros, normal iPhones, Minis and are not experiencing the issue. You would think that a USDZ file is a USDZ file and that your iPhone knows what to do with it and open it in the Quick Look player regardless of where you open the file from. Why would the navigation items be moving if you open the USDZ file from within our app, and why only for some select users?
We will continue to troubleshoot and test but I wanted to throw this out to the community in case anyone had experienced this or if anyone had any theories that would expedite our testing. Your thoughts are most appeciated!
Here is a video showing the expected (correct) behaviour: https://www.dropbox.com/scl/fi/0sp8s4opaf2m4gukkcbrk/How-opening-a-USDZ-should-behave_correct-behaviour.MP4?rlkey=tzzau9x91mwox66gsgguryhep&st=qiykmne9&dl=0 and a screenSHOT attached below of what is happening on one of the affected user's iPhone 15Pro Max.
We are currently using Apple's Object capture module and wonder if it would be possible to collect the following data :
Device information
Current translation / rotation
Focal length embedded to the image headers
GPS localisation information.
Information about the exposure time
White balances and the color correction matrices
We also have 2 additional questions :
Is there an option to block close up accomodation of the camera ?
Is there a way for the object capture module to take a video instead of a series of picture ?
I'm wondering if anyone can suggest a workaround for the broken ARKit body tracking in iOS / iPadOS 18.0 and 18.1?
The orientation for foot bones (and possibly other bones) is incorrect in the ARBodyAnchor returned via ARView.session.delegate update method. It works correctly in iOS / iPadOS 17.x.
The same failure occurs in a SceneKit app via a ARBodyAnchor in ARSCNViewDelegate.
You can easily verify the problem using Apple’s own sample app: https://developer.apple.com/documentation/arkit/arkit_in_ios/content_anchors/capturing_body_motion_in_3d
Any help would be greatly appreciated.
I’m currently working on a project where I need to display a transparent GIF at a specific 3D point within an AR environment. I’ve tried various methods, including using RealityKit with UnlitMaterial and TextureResource, but the GIF still appears with a black background instead of being transparent.
Does anyone have experience with displaying animated transparent GIFs on an AR plane or point? Any guidance on how to achieve true transparency for GIFs in ARKit or RealityKit would be appreciated.
When I first install and run the app, it requests authorization for hand tracking data. But then if I go to the settings and disable hand tracking from the app, it no longer requests. The output of requestAuthorization(for:) method just says [handTracking : denied]
Any idea why the push request only shows up once then never again?
I'm writing a RealityKit/ARKit app that runs on iOS.
Starting with Xcode 16.0 beta 1, at least through Xcode 16.1 beta 2 (16B5014f), in the iOS 18 simulator, my app randomly crashes in about 20% of app sessions the first time it attempts to present an ARView.
The crashes seem to occur at multiple points within RealityKit and Metal.
Below, I've included screenshots of the call stacks of the crashes, which occur as a result of both EXC_BAD_ACCESS and assertion failures within RealityKit.
The app only crashes in the iOS 18 simulator, and does not crash in the iOS 17 simulator or earlier.
The app only crashes in the simulator, and does not crash on a device running iOS 18.
Before I investigate further, I'd appreciate it if an Apple engineer could give me a sense of if these crashes are most likely the result of known issues within RealityKit and/or the simulator, or if your opinion is that there are probably bugs in my app's code.
I've submitted several feedback issues in the past, and I'd love to submit this issue too, but I expect that I would spend many hours attempting to create a repro case in a sample app. Understandably, I'd rather not spend this time if an Apple engineer could tell me this is a known issue, for example.
Thank you.
We are using the ARKit image tracking feature on visionOS 2.0 with three pre-registered images. The image tracking works, but only one image is actively tracked at a time. When more than one target image is visible to the camera, it has difficulty detecting and tracking the other images.
Is this the expected behavior in visionOS, or is there something we need to do to resolve this issue?
We are currently working with the Enterprise APIs for visionOS 2 and have successfully obtained the necessary entitlements for passthrough camera access. Our goal is to capture images of external real-world objects using the passthrough camera of the Vision Pro, not just take screenshots or screen captures.
Our specific use case involves:
1. Accessing the raw passthrough camera feed.
2. Capturing high-resolution images of objects in the real world through the camera.
3. Processing and saving these images for further analysis within our custom enterprise app.
We would greatly appreciate any guidance, tutorials, or sample code that could help us achieve this functionality. If there are specific APIs or best practices for handling real-world image capture via passthrough cameras with the Enterprise APIs, please let us know.
I am new to learning about concurrency and I am working on an app that uses the HandTrackingProvider class.
In the Happy Beam sample code, there is a HearGestureModel which has a reference to the HandTrackingProvider() and this seems to write to a struct called HandUpdates inside the HeartGestureModel class through the publishHandTrackingUpdates() function. On another thread, there is a function called computeTransformofUserPerformedHeartGesture() which reads the values of the HandUpdates to determine whether the user is making the appropriate gesture.
My question is, how is the code handling the constant read and write to the HandUpdates struct?