EUROITV2010, Tampere Finland June 9th-11th
Expectation.
From 2004 to 2010, I must have been to well over 25 IPTV or broadband conferences. I’ve presented, held a booth, run a workshop or chaired sessions. In the last year I’ve also taken up blogging from these events. But all were trade shows, and the life span of their conferences is inversely proportional to the energy the organisers put into milking the event for profit. So it’s rarely a good sign when these things get too big, if it’s the conference you’re interested in.
When I first heard of EUROITV2010, an academic digital TV conference, from Arian Koster of KPN, a few months ago, I wrote to the organisers saying who I was and that I was interested and could help promote the event. I was half expecting to be invited as a guest or something.
I got no answer. Of course that’s always a bit off-putting, but getting past the vexation, my curiosity was piqued. At last I would go to a conference where presenters have to compete with their peers on the quality and originality of their presentation to get a slot, not on the amount of money they pay for a sponsorship package.
So here I am on the plane, on my way to a conference I’m paying to attend. The last time I paid was almost ten years ago! Actually the investment, for the independent consultant I am, is only hitting home now: 500€ for the conference, 400€ for travel, 350€ for accommodation and I expect at least 150€ for extras (I’m flying Finnair so a beer & nuts is already 7€ for starters). That is 1400€, plus three days away from home & work, plus missing the first two World Cup games. I really must get something out of this.
After a day around Tampere, one of the larger inland cities in Finland, I was getting an idyllic view of the place. If Finland were only what my eyes have seen so far, then this is a land of light - this time of year at least - with no houses, just blocks of flats, and beautiful people whose women have exclusively blond or jet black hair. There are seagulls in the city miles from the sea, pleasant fried fish smells at meal times, really boring handsets and market butchers with large Nokia Bluetooth earpieces.
OK enough tourism, what about the content? Did academia stand up to my high expectations?
Well yes in most ways.
A trade-show conference is 30% rubbish, 30% boring, 30% OK and 10% interesting. EuroITV2010 showed a very different profile.
From what I saw there was probably as much rubbish (let’s round it to a third), but then the rest was all interesting, with another third being scientific reasons that explain hunches a lot of us have in the industry, and the final third being really new, at least to me.
I attended the Quality of Experience workshop the day before the main event. Of course it had a proper academic title (QoEMCS or something), but really was just a QoE workshop. I was the only non-academic out of 12 participants.
A first researcher presented a huge mobile 3D project that must have cost lots of our tax Euros. They have uncovered some stereo video encoding issues, with a current debate on MRS vs. AVC Simulcast. The project started using DVB-H, which was politically correct at the time of project funding. “Actually a very good standard from a technical point of view” according to the presenter.
This is clearly a mid to long-term project, as the main technologies considered were auto-stereoscopic, which means “glasses-free”. There are three ways of achieving this:
- Stereoscopic for single user with one small “sweet spot”
- Multi-view for multiple observers, with a larger viewing region but multiple, even smaller “sweet spots”
- User tracking with various techniques. This seemed to be a bit of science fiction, I mean what if a fly landed on your nose or something …
In the work done so far, an interesting hypothesis has arisen that 2D & 3D artefacts are independently perceived. What this means was left to the listener to deduce; I suppose that must be the academic way of seeing who’s a member of the club. I’m not a member of that elite club yet and didn’t get what the implications of this hypothesis might be.
Focus groups conducted within this research program concluded that user expectations of mobile 3DTV were primarily for social gathering: to achieve a sense of “being there”. In terms of content the two genres that came up most as potentially interesting were action movies and documentaries.
Mobile 3D coding is less resilient to network error and is therefore harder to standardise than 2D.
Amongst ongoing challenges, the project team is working on an objective perceptual quality metric for mobile stereo-video, which delves into neuroscience as much as into video expertise.
Priorities for bringing 3D to the mobile devices are first the User Interface, then games, video being last.
The corpus of 3D cinema that will be available with films like Avatar will pose a challenge. They are designed for viewing from a distance. It will be difficult to repurpose them for mobile and “convert the baseline” or extend the background.
+++
Then followed a short punchy presentation of user-driven adaptation of H264/SVC streams. It was one of those “why didn’t I think of that” moments.
Rather than have streams adapt just on available bandwidth parameters, the idea is to let the user influence the adaptation of bandwidth, frame rate or resolution. Several use cases were given: when there is incomplete information on the usage environment, when the user’s interest in the content varies significantly, when billing conditions change, … This is just a project proposal, but so far Jordi Ortiz from the University of Murcia in Spain has considered 3 adaptation parameters: varying priority, adjusting minimum values and bandwidth adaptation.
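To make the idea concrete, here is a minimal sketch of how user preferences could steer the choice of a scalable-stream operating point. Nothing here comes from the talk itself: the `Layer` and `UserPrefs` structures, the field names and the selection rule are all my own assumptions, loosely mapped to the three parameters above (a priority between resolution and frame rate, user-set minimum values, and a user-set bandwidth cap).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Layer:
    """One operating point of a scalable stream (all values hypothetical)."""
    width: int
    height: int
    fps: float
    kbps: int


@dataclass(frozen=True)
class UserPrefs:
    """Hypothetical user-side knobs, not taken from the presentation."""
    min_fps: float = 0.0             # user-adjusted minimum frame rate
    min_height: int = 0              # user-adjusted minimum resolution
    max_kbps: Optional[int] = None   # user cap, e.g. when billing changes
    prefer: str = "resolution"       # priority: "resolution" or "fps"


def pick_layer(layers, available_kbps, prefs):
    """Pick the best layer that fits the bandwidth budget and the user's wishes."""
    budget = available_kbps
    if prefs.max_kbps is not None:
        budget = min(budget, prefs.max_kbps)

    fits = [l for l in layers if l.kbps <= budget]
    if not fits:
        # Nothing fits: fall back to the cheapest layer rather than stop.
        return min(layers, key=lambda l: l.kbps)

    # Honour the user's minimum values when possible, otherwise ignore them.
    ok = [l for l in fits
          if l.fps >= prefs.min_fps and l.height >= prefs.min_height]
    candidates = ok or fits

    if prefs.prefer == "fps":
        return max(candidates, key=lambda l: (l.fps, l.height, l.kbps))
    return max(candidates, key=lambda l: (l.height, l.fps, l.kbps))
```

With 800 kbps available, a viewer who prefers resolution would get a larger but slower picture, while one who prefers frame rate would get a smaller but smoother one, from exactly the same stream.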
+++
I said academia has its quota of time wasting and there followed a really lame presentation on QoE measurement from multiple points in the networks: several years behind the industry.
+++
The next talk was one of those “I knew that was true, but now I know why” moments. Controlled resource reduction affects QoE much less than uncontrolled packet loss. Doh! A few slides of arcane scientific explanation illustrated how the overall Quality of Experience is less affected by lowering bit rates proactively than by letting packets get lost at network bottlenecks. The researcher came up with a really interesting question: “If we knew why users leave streams, QoE tools could improve by a quantum leap.” That’s the beauty of academic research. We can ponder this even if there is still no workable way to progress this issue.
+++
Yohann Pitrey, from Nantes University, then gave a crisp talk about research he’s conducted into ‘Subjective Quality Evaluation of H264 HD Video Coding Vs Spatial up-scaling’. The study clearly shows that it is better in QoE terms to downscale rather than interlace a full HD video to save bandwidth. Interestingly the study also shows that 720p at 6 Mbps is close to 1080p at 9 Mbps in user perception of quality.
+++
Shelley Buchinger from Vienna then gave a short talk on findings from her project on ‘Content Aware enCoding for Mobile TV’ or CACmtv for those in the know.
An interesting question was that if the minimum number of non-expert users required to test video quality is 15, how many expert users does it take to produce video quality measurement (MOS) that is as reliable? The University of Vienna study shows that you can achieve similar results with just 6 expert users & simplified procedures, but interestingly only as long as the video used to benchmark has only one Area of Interest.
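For anyone unfamiliar with MOS, it is simply the mean of the scores a panel gives a clip, and panel size matters because it drives how wide the uncertainty around that mean is. The sketch below is my own illustration, not the Vienna team’s procedure: it uses a normal approximation (`z = 1.96`) for a rough 95% interval, whereas a careful study on a small panel would more likely use a Student t value.

```python
import math
import statistics


def mos_with_ci(ratings, z=1.96):
    """Return (MOS, half-width of an approximate 95% confidence interval).

    ratings: opinion scores from one panel for one clip, e.g. on a 1-5 scale.
    z: normal-approximation multiplier (an assumption of this sketch).
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    mos = statistics.mean(ratings)
    spread = statistics.stdev(ratings)  # sample standard deviation
    return mos, z * spread / math.sqrt(len(ratings))
```

The half-width shrinks with the square root of the panel size and grows with the disagreement between viewers, which is presumably why a small panel of experts who agree with each other can match a larger panel of non-experts.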
+++
A presentation on using metadata for video quality assessment reminded us of the shortcomings of the usual suspects like MSE or PSNR for determining signal quality. There was a new one for me: MSSIM, which is a Mean Structural SIMilarity index. It uses luminance, contrast and, more importantly, structure to compare a reference image with the one being tested. Apparently it’s been around since 2004 and competes with another new approach called VIF. Quality impairments in dark regions are less important than in light ones. An MPEG7 semantic description of a sequence helps define a region associated with the semantic content, so quality assessment effort can be focussed there.
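To show how these metrics differ, here is a small pure-Python sketch of MSE, PSNR and a simplified MSSIM on grayscale images stored as flat row-major lists. It is only an illustration under my own assumptions: the constants `k1 = 0.01` and `k2 = 0.03` are the commonly quoted SSIM defaults, and I average over non-overlapping 8x8 blocks, whereas the published method slides a Gaussian-weighted window over the image.

```python
import math


def mse(ref, test):
    """Mean squared error between two equally sized pixel lists."""
    return sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)


def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB (infinite for identical images)."""
    err = mse(ref, test)
    return math.inf if err == 0 else 10 * math.log10(peak ** 2 / err)


def ssim_window(x, y, peak=255.0, k1=0.01, k2=0.03):
    """SSIM of one window: combines luminance, contrast and structure."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) * (a - mx) for a in x) / n
    vy = sum((b - my) * (b - my) for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    c1, c2 = (k1 * peak) ** 2, (k2 * peak) ** 2
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx * mx + my * my + c1) * (vx + vy + c2))


def mssim(ref, test, width, block=8):
    """Mean SSIM over non-overlapping block x block windows (simplified)."""
    height = len(ref) // width
    scores = []
    for top in range(0, height - block + 1, block):
        for left in range(0, width - block + 1, block):
            idx = [(top + r) * width + left + c
                   for r in range(block) for c in range(block)]
            scores.append(ssim_window([ref[i] for i in idx],
                                      [test[i] for i in idx]))
    if not scores:
        raise ValueError("image is smaller than one block")
    return sum(scores) / len(scores)
```

MSE and PSNR only look at pixel-by-pixel differences, while the covariance term in `ssim_window` rewards a test image whose local patterns follow those of the reference, which is the “structure” part.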
+++
Florian Wamster from the MNRG lab in Würzburg, Germany presented a project which should really turn some heads. YoMo is a YouTube “Application Comfort” Monitoring tool.
Their starting question was how to measure QoE for YouTube viewing. The worst influence on YouTube viewing is stalling when the video just stops before the end, waiting for the buffer to fill up again. The results so far use both network parameters and the YouTube API and show that network management could react if stalling seems likely. Operators could soon offer premium services with enhanced YouTube viewing.
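The talk did not go into how the prediction is made, so the following is only my guess at the general shape of such a check, not YoMo’s actual method. It assumes a monitor can observe two numbers over time, how many seconds of video have been downloaded and where playback currently is, and it flags trouble when the buffer between them is both small and shrinking. The function names and the 5-second threshold are hypothetical.

```python
def buffered_playtime(downloaded_s, position_s):
    """Seconds of video already downloaded but not yet played."""
    return max(0.0, downloaded_s - position_s)


def stalling_likely(samples, threshold_s=5.0):
    """Return True if the latest buffer level is low and still shrinking.

    samples: chronological list of (downloaded_s, position_s) observations.
    threshold_s: hypothetical alarm level, in seconds of buffered video.
    """
    if len(samples) < 2:
        return False  # not enough history to see a trend
    previous = buffered_playtime(*samples[-2])
    latest = buffered_playtime(*samples[-1])
    return latest < threshold_s and latest < previous
```

A network manager receiving such a warning a few seconds ahead of the stall would, in principle, have time to give that flow more bandwidth.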
+++
The last presentation from the QoE workshop basically explained that there’s a trade-off between the quality of the image and the time users are prepared to wait for it to load. Wow, that was worth coming to Finland to hear. Oh yes, there’s an HIQM metric used to measure this.
+++
+++
The conference itself kicked off with a sleek presentation on the future of content distribution by Siemens’ Marcos Gonzalez-Flower. He compared broadcasters with Venetians and warned: "the battle for the lounge is almost over and the natives are revolting". It was witty and entertaining but didn’t contain any new information.
+++
During the coffee break I saw a demo of http://www.smeet.com. It’s a lightweight Flash-based virtual world particularly suited for watching YouTube videos with your buddies & discussing the film.
http://madm.dfki.de showed a video content analysis demo called ‘smartvideobuddy’. The demo was impressive for automatic tagging within a narrow range of possibilities: it said which of 16 predetermined sports a video was about. But this kind of approach doesn’t work at all in a context where you have no idea what the video is about.
+++
Mike Darnell of Microsoft’s MediaRoom gave an interesting presentation about ad-skipping, which is the most frequent DVR interaction. It turns out thumbnail-skipping, where users get to skip over each ad represented as a thumbnail at the bottom of the screen, is the preferred method, but unfortunately it’s bad for brand recall. Ad skipping is of course a much bigger issue in the US; I can only hope it stays that way.
+++
A “Hypernews” project by Päl Aam of Volda University in Norway was about clickable or hyper-video to go beyond the typical 90-second TV news story (which is roughly 200 spoken words). The idea is to go beyond the shallow pieces without nuance. I must have misunderstood something because this really gave me a sense of déjà vu.
+++
Tanja Erdem, a rare non-academic, from Momentum, Istanbul, described the complete environment, including authoring tools, that they have built for getting PPT and PDF on TV. Their market is eLearning and exams. For once it was something down to earth, but maybe the price to pay for that was that it wasn’t very intellectually stimulating.
+++
IPTV personalisation based on non-identity attributes basically means using age and gender to tailor the content. Wei-Yun Yau from Singapore’s A*STAR presented a project they are developing, with a camera sitting on top of the STB (could an STB-Top-Camera be a STBTC or maybe an S2TBC?) for such gender and age recognition. (http://www.a-star.edu.sg)
+++
A thought that has been germinating in my mind for a while finally came to fruition during the last panel of the day, as Roger Lay of SapientNitro gave a suave talk including a Friendly Filtering concept (actually FF could also mean Fast Forward, or skip all the junk and get me to where I want to watch). Content recommendation has been a key concern of mine for 5 years, but with OTT here at last, for the TV it’s more about getting rid of all the rubbish. Recommendation was yesterday’s battle, when we thought VoD would succeed within walled gardens. OTT changes that: “Friendly filtering” is the new mantra!
+++
During the panel Jussi-Pekka Koskiranta from YLE, the national Finnish broadcaster, told us that live TV viewership was doing fine for adult and kids segments, but not for teens. So their WebTV efforts target teens in particular, with Internet-only programs and a chat-enabled viewing experience. The service is apparently successful but we didn’t get any hard figures to prove this.
+++
Olof Schybergson, CEO of Fjord, a Nordic user experience (UX) company that was involved in the BBC iPlayer rollout, told us that "the couch potato is resistant to change but technology has already taken him away from linear TV and there’s no way back."
Wrapping up
There was a whole stream of presentations about market research techniques. Despite never-ending process enhancement, I still feel that research based on asking people what they want remains as shaky as ever. You can always get the research to say what you want. The latest trend is about “living labs” where a few families are followed in their usage for several years. Maybe this will help change my mind on this sociological approach.
A recurring theme during the conference was giving meaning to video with:
- better use of existing metadata (e.g. EPG),
- user generated tags
- automatic generation of tags through semantic analysis (e.g. MPEG7)
I had to miss the second day for personal reasons. So I’m looking for a write up of that. And of course EUROITV2011 in Portugal.