Perry and Andreas came up with this list in our new Collaborative Design Lab. Let’s just call it a to-do list.
Perry and Andreas came up with this list in our new Collaborative Design Lab. Let’s just call it a to-do list.
I spent a week or so wrangling some variations on the immersive viewer apparatus for the little research group at USC working on this project. Some of the “early” prototyping of the concept I had done last winter, described in this blog post.
Taking a hint from Mark Bolas to try some design prototyping with foam core and an Xacto knife, I figured I’d try some ideas out. I have to say, that after mostly thinking, writing and wrangling groups of people for conference, symposium and workshops, doing some tangible sort or craft work was incredibly satisfying. It was really nice to be in a situation where I was generating sawdust and sparks.
I managed to get a hold of one of the new Sony UX-180 UMPC form-factor devices (Windows XP, Core Solo 1.2GHz, 512MB RAM) an overpriced fetish device that, despite the cost, was perfect for this particular prototyping exercise. If an immersive viewer were to be as normal as an iPod this might be about the form-factor, i would guess. The unit is approximately the size of an unfolded Nintendo DS, and definitely heavier. The display is somewhere between good and excellent, imho, and despite the small size, the resolution of the screen only presents small challenges for reading text.
The goal of this little design and coding experiment was to create a reasonably self-contained viewer prototype that would allow form-factor and handling tests, figure out what optics configurations work (lenses, image size, etc.), have a platform for testing the omnistereo QuicktimeVR imagery and test software designs on this particular platform.
The “Configuration A” prototype used a huge TabletPC and a simple reflecting mirror system, together with a beautifully designed and humongous custom designed enclosure by the masterful Sean Thomas, a grad student at Art Center College of Design. The general idea was to orient the screen horizontally and use a mirror to increase the optical distance between one’s eyeballs and the screen by reflecting light. I was also guessing that it would be easier to hold the whole apparatus as if one were getting ready to take a bite out of a giant deli sandwich. Holding the apparatus out with one’s arms extended, quickly became tiring.
Second meeting at Art Center with Sean Thomas and Jed Berk to go over the first breadboard version Sean created. Placement of the sensor is an issue — it is sensitive to metal and EM fields generated by the TabletPC. We experimented with a few placement options, including in front, below in various places. Sean had the idea of a cut-out in the breadboard so that the sensor could go below, without sticking way out in the bottom. We foudn that th ebest placement for the time being seemed to be in the front, in front of the mirror. Sean showed me some of the 3D printing materials. We settled on this plaster material, which is more fragile than ABS, and cheaper. It seemed robust enough for a first version housing. Showed a few students who happened to be in the studio the app and they all thought it was very cool, so that’s, like..cool.
Sean had the difficult challenge of making something that would fit both a large TabletPC while still being well-designed enough to be somewhat hand-held and suitable for demonstrating the conceptual underpinnings of the project. He ulimately came up with a design that, I was told, was the biggest thing to come out of the Art Center’s 3D printer. I don’t know if that’s good or bad..
This “Configuration A” prototype was a great learning experience. Things like realizing that everything in the display is now backwards was a great part of the prototyping/learning experience. Realizing that I needed to find an expedient way to eliminate the wires that connected the sensor hardware to the computer was a great part of the prototyping/learning experience. Discovering that the Max/MSP patch I was using to broker the signals from the sensor to QuicktimeVR actually turned the QuicktimeVR in the “wrong” directin was a great part of the prototyping/learning experience.
So, “Configuration A” was an unqualified success in these regards. Definitely something that I could learn from and improve. Alas, I filed it away for several months an put the project on a semi-hold.
Then I started working on “Configuration B”, with a much smaller and much more powerful CPU, at least on paper — the Sony UX-180.
With more time this summer to work on the prototype, I decided I wanted to get rid of Max/MSP as the software “middleware” and code something from scratch. I felt I wanted to get closer to a solution that was as close to the operating system as possible, to maximize efficiencies and all the rest.
A quick test indicated that there was sufficient support within Windows .NET framework for QuicktimeVR to do what I wanted to do, and, from the specs, the UX-180’s graphics chipset was actually more robust than that on the TabletPC I had been using. Technically, it should work well, but it wasn’t until I got the UX-180 that I’d know. Of course, I delicately unpacked the unit in case I decided to return it.
With a little test app, attempting to simultaneously render two QuicktimeVRs (left eye and right eye for the ultimate goal of stereoscopy), I found out that the little UX-180 was up to the task.
I patched in some RS232 serial protocol code to link together the sensor and the QuicktimeVRs, so that the sensor would provide the viewing point-of-view angle and booted it up on the UX-180. Although the device would render the QuicktimeVRs without any problem, I quickly found out that the UX-180, strangely, completely balked at reading the serial data with any reasonable speed. The machine literally ground to a halt, the mouse staggered and the fan spun up like a turbine. WTF? Render two video windows with no problem, but as soon as you have a little threaded serial port data reader, reading 12 bytes at a clip, the machine stalls? How disheartening. At first I thought it was my code, but even running some very simple tests straight from some trivial design patterns caused the same problem. I have been using Johan Franson’s amazing Serial Tools kit from franson.biz for years, and it was hard to believe this was the culprit. But, I went native just to eliminate the possibility and wrote my own serial port reader using the built-in .NET APIs. It worked much better, but I don’t blame Serial Tools and here’s why: everything runs fine on multiple desktop machines and the TabletPC. It’s only when the code attempts to run on the UX-180 that there are problems, leading me to believe that there is some particular weirdness on the UX-180 that causes the Serial Tools code to gum up.
So long as I could get the code to run, even using the slightly less elegant .NET “native” serial port API, I was happy. Now it was on to some physical prototyping.
I guess I had the “binocular” form factor stuck in my head from the original Sean Thomas prototype. I just immediately started creating a similar configuration from cardboard and went to my local glazer and had a bunch of tiny mirrors cut.
My general goal for the Binocular Form Factor was as you see below. I got fairly well into it, but once I clamped it to the UX-180 just to see how it felt, a few problems arose. First, the UX-180 had to be oriented such that the “top” of the screen (as one would view it if you were holding the device conventionally), had to be towards your face because the mirror reverses everything. No biggie, except that there is a fan there (there’s also one on the bottom, but much less significant) that seems to be the busiest of them all. Not huge, but with your chin essentially abutting it, the heat is uncomfortable, to put it mildly. I thought of putting a small deflector baffle there, but quickly the whole design seemed destined to be awkward and embarrassing.
One of the problems of working alone in the back shed is that you don’t get collegial feedback and dialogue. It’s okay — that was just the circumstances of working during summer break. But had I been around others, we may have come up with an alternative — the one that struck me while I was riding my bike somewhere — what about a “periscope” style design? The UX-180 has this slide-up/slid-down screen which would make such an architecture suitable for that configuration, while also providing access to the keyboard. (One of the considerations I had was how to control various “meta” features of the viewer — switching slides, etc. Keyboard buttons and soft-switches are perfect for this.)
I sketched this out and then got busy with the Xacto..
In these images you can see a design that’s much more accommodating to the general flavor of the viewer. I got a small selection of lenses designed for stereoscopy from 3D Stereo and tossed them in there, spent an evening sprucing up the viewer software, put in a simple Bluetooth-to-Serial dongle to avoid having to run a clunky USB cable into the UX-180, and processed a bunch of test 3D panoramas, turning them into QuicktimeVRs and gave it all a whirl.
The whole configuration works much better than I would’ve expected from a second prototype. I configured the soft-key “zoom-in/zoom-out” buttons on the UX-180 to move forward and backward through a directory of QuicktimeVRs. The directory either contains stereo pairs named “Right_XYZ” and “Left_XYZ” so the system knows which one’s to pair together. (Note to self: remember that the naming scheme also contains numerics indicating the size of the slit width and the distance â€” left or right – from the image center line that the slits are taken from so I can recollect how the image was processed and what configuration of these parameters work best to deliver decent stereo acuity.) The software consistently breaks the first time it starts up (but doesn’t after that first time), which I think means that I should not manually connect the device to the sensor before starting the software. Switching from the 3D panoramas to conventional QuicktimeVR panoramas works well in software, but, obviously, a different set of optics need to be moved in place in order to see non-stereo, non-3D renderings.
Art Center Model Shop
The back story of the project tracks back to a conversation with Naimark. Working on the (wrong) impression that a stereo panorama could be created trivially, using two cameras on a rotating, panoramic rig, I was all set to make stereo QuicktimeVRs. Naimark pointed me to a research paper that indicated that such a camera configuration wouldn’t work for a panorama — the geometry is hooey when the two cameras rotate about an axis in between them if you try to capture one continuous image for the entire panorama. In other words, setting up two cameras to each do their own panorama, and then using those two panoramas as the left and right pairs will only produce stereo perception from the point of view at which the cameras are side-by-side, as if they were producing a single stereo pair. You would have to pan around, capturing an image from each point of view, and mosaic all those individual images to produce the stereo.
(I put the references to research papers I found useful — or just found — at the end of this post.)
This is the experiment I wanted to do — two cameras, capturing an image from every view point (or as many as I could) and create a mosaic from slits from every image. That’s supposed to work.
I plowed through the literature and found a description of a technique that required only one camera to create a stereo pair. That sounded pretty cool. By mosaicing a sequence of images taken from individual view points, while rotating the single camera about an axis of rotation behind the camera, you can create the source imagery necessary to create the stereo panorama. Huh.. It was intriguing enough to try. The geometry as described still hasn’t taken hold, but I figured I would play an experimentalist and just try it.
So, the project requires about three steps — the first is to capture the individual images for the panorama, and the second is to mosaic those images by cobbling together left and right slits into left and right images, the third is to turn those left and right images into Quicktime VRs.
After a couple of days poking around and doing some research, I finally decided to build my own â€” this would save a bunch of money and also give me some experience creating camera control platform for the backyard laboratory.
Camera Control Platform
My idea was pretty straightforward — create a rotating arm with a sliding point of attachment for a camera using a standard 1/4″ screw mount. I did a bit of googling around and found a project by Jason Babcock, an ITP students who created a small rig for doing slit-scan photography. (The project he did in collaboration with Leif Krinkle and others was helpful in getting a sense of how to approach the problem. The geometry I was trying to achieve is different, but the mechanisms are the same, essentially, so I got a good sense of what I’d need to do without wasting time making mistakes.)
While I was waiting for a few parts to arrive, I threw together a simple little controller program using a Basic Stamp 2 that could be controlled remotely over Bluetooth. I wanted to be able to step the camera arm one step at a time either clockwise or counter-clockwise just by pushing a key on my computer, as well as have it step in either direction a specific number of steps with a specific millisecond delay between each step.
My first try was to use the rig to rotate in a partial circle, accumulate the source imagery and then figure out how I could efficiently create the mosaics. There was no clear information on the various parameters for the mosaics. The research papers I found explained the geometry but not what the “sweet spots” were, so I just started out. I positioned the camera in front of the axis of rotation and set it up in the backyard. I captured about 37 images in maybe 60 degrees. At each step of 1.8 degrees, I captured an image using an IR remote for my camera.
There were any number of problems with the experiment, and I was pretty much convinced that there was little chance this would work. The tripod wasn’t leveled. There was all kinds of wobble in the panorama rig. The arm I was using to position the camera in front of the axis of rotation had the bounce of a diving board. Etc., etc. Plus, I wasn’t entirely sure I had the geometry right, even after an email or two back and forth with Professor Shmuel Peleg, the author of many of the papers I was working from.
With 37 source images, I had no clear idea about how to post-process them. I knew that I had to interleave the mosaics, taking a portion from left of the center for the right eye view, and a portion from right of the center for the left eye view. Reluctantly, I resorted to Apple Script to just the job done — scripting the Finder and Photoshop to process images in a directory appropriately. I added a few parameters that I could adjust — left and right (obviously), the “disparity” — number of pixels from the center where the mosaic slit should be taken, and width of the slit. I plugged in a few numbers — 40 pixels for the disparity and a slit width of 30 and just let the thing run, and this is what I got for the right eye.
You can see each slit produces a strip in the final image. It’s most obvious because of exposure differences or disjoint visual geometry. (Parenthetically, I made a small change to the Apple Script to save each individual strip and then tried using the panorama photo stitcher that came with my camera on those strips — it comlained that it had a minimum photo size of 200 pixels or something like that. I also tried running it on some other, more prosumer photo stitcher, but I got tired of trying to make sense of how to use it.)
With the corresponding left eye image (same parameters), I got a stereo image that was wonky, but promising.
Here are the arranged images.
I did a few more panoramas to experiment with, well..just to play with what I could do. Now I had the basic tool chain figured out except for the production of a QuicktimeVR from the panoramas. After trying a few programs found a program called Pano2QTVR (!) that can produce a QuicktimeVR from a panoramic image, so that pretty much took care of that problem â€” now I had two QuicktimeVRs, one for my left eyeball, the other for the right.
Why do I blog this? I wanted to capture a bunch of the work that went into the project so I’d remember what I did and how to do it again, just in case.
Tom Igoe's stepper motor information page (very informative, as are most of Tom's resources on his site. Bookmark this one but good!)
S. Peleg and M. Ben-Ezra, "Stereo panorama with a single camera," in Proc. Computer Vision and Pattern Recognition Conf., pp. 395--401, 1999. http://citeseer.ist.psu.edu/peleg99stereo.html
S. Peleg, Y. Pritch, and M. Ben-Ezra. Cameras for stereo panoramic imaging. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR'00), Hilton Head, South Carolina, volume 1, pages 208--214, June 2000. http://citeseer.ist.psu.edu/peleg00cameras.html
P. Peer and F. Solina:"Mosaic-based panoramic depth imaging with a single standard camera, " Proc. Workshop on Stereo and Multi-Baseline Vision, pp.75-84, 2001 http://citeseer.ist.psu.edu/peer01mosaicbased.html
Yael Pritch , Moshe Ben Ezra , Shmuel Peleg. Optics for OmniStereo Imaging, L.S. Davis (Editor), Foundations of Image Understanding, Kluwer Academic, pp. 447-467, July 2001.
Ho-Chao Huang , Yi-Ping Hung, Panoramic stereo imaging system with automatic disparity warping and seaming, Graphical Models and Image Processing, v.60 n.3, p.196-208, May 1998
Ho-Chao Huang and Yi-Ping Hung, SPISY: The Stereo Panoramic Imaging System, http://citeseer.ist.psu.edu/115716.html
Panotable Stitcher Program
Stepper Motor Program
In the general category of breeching the berm between 1st Life & 2nd Life â€”Â you know, finding ways in which activities in the normal geophysical world can be linked to those in the digitally networked world, I’ve pulled together some thinking into a project that I’m calling Flavonoid. The general idea is to create ways in which linkages between 1st Life and 2nd Life can be articulated more pervasively â€” not just while you’re playing your computer game or doing your email.
Flavonoid started with an idea â€” suppose I had to “power up” so as to get “points” or “coin” to work on my computer or play a game? Suppose that “power up” had to be physical activity? Or suppose there was a way to encourage sustained activity through a game mode, so that motion was accumulated and turned into “points” or some sort of reward?
The feed was created using Geotagthings. That feed can then be used for other things, like your favorite feed aggregator, but most relevantly in this case, the URL to the feed can be shoved into the Google Maps search field, from which is produced a map of the locations in the feed that Will created. Seems weird..putting an URL in a search field rather than the name of a restaurant, but it takes.. from which is produced a map of the locations in the feed.
Vince’s post pointed out that Digital Photography Review has release notes on this Sony dangly thing that uses GPS to record your location and precise time at that location. When you get home and want to correlate location with your photos, it’s a simple matter of time-synchronizing with the time stamp embedded in the photos’ EXIF data.
Jeffrey Early, a graduate student at Oregon State, has a little Mac OSX app called GPSPhotoLinker that does essentially the same thing without the marketing muscle of Sony. (You have to supply your own GPS.) It’ll take your photos and, with a little magic from an iView Media Pro template, grab a piece of map from Terra Server centered on where the photo was taken. Here’s a photo Jeffrey took at Glacier Bay.
It might be that Sony introducing this as a “Sony Style” gizmo indicates that Sony expects the current digital image sharing craze to vector into the location idiom, which sort of makes sense. If there were an easy way to record location with photographs, we’d be well on our way to the creating the Ubicamera Surveillance Apparatus â€” it’d be a simple matter to find out what happened where, when and with whom. Cool..shudder.. On the one hand, I guess it’s cool to find out where a photo was taken, but only for curiosity’s sake. I’m not _really_ interested in the instrumental measurements of that location. It’s enough that some pals had fun at a beach in Greece or whatever. I don’t really need to know the lat/lon.
Who would be interested in the lat/lon? Do you really want to make it easy for those guys to know precisely where you are?
The tension between capturing that information for ludic and/or technofetish reasons and not providing that information for its potential to become a surveillance modality presents a tricky quandry. On the one hand, it’s cool..on the other hand, the potential for abuse might offset the coolness.