Pad.ma and Mozilla @ CAMP, part II
Cinematographer: Radha Mohini Prasad
Duration: 01:38:41; Aspect Ratio: 1.778:1; Hue: 63.390; Saturation: 0.044; Lightness: 0.289; Volume: 0.173; Cuts per Minute: 2.574; Words per Minute: 113.881
Summary: Some folks from the Mozilla offices in Mountain View came to CAMP, where we discussed web standards, the open web, and demoed pad.ma.
You can read Seth's blog post about the event.
In part II were presentations by the pad.ma team, including Jan Gerber and Sebastian Lutgert.
Sanjay: .. so I can send you a link that starts at 35 minutes -- we do partial clip downloads, so say you want only 5 minutes of this video, you can get that..
CAMP Rooftop, Khar (W), Mumbai
.. general discussion around dinner..
Nagarjun: NASSCOM, the body, started because of Microsoft. Adobe and Microsoft started this thing, and then they started raiding in anti-piracy efforts. That was one of the first tasks that NASSCOM did. Saying that as the software industry, one thing that is really important is the fight against software piracy, so NASSCOM started raiding on behalf of Adobe and Microsoft.
...
Nagarjun: That's the history of NASSCOM actually. But today, I remember very well what the Minister said, when we asked them, "why are you introducing software patents? Who asked for it?". They said, "the community asked for it." I asked who the community is - they said, "NASSCOM". I said NASSCOM is not the community - NASSCOM is Microsoft.
Arun: What was the answer?
N: That was the answer. They know we are not going to accept that definition, so then we told them about the dangers, and also the advantages if India doesn't follow software patents. And that moderated the whole issue. But now they want an amendment to the Copyright Act. So when that begins to happen again, I'll alert all of you, to start writing letters.
Seth: Yea, we weigh in on things like the DMCA in the United States. We'll write letters of opinion and we put a lot of thought into it. So if it comes up, as I was suggesting to Arun..
N: Maybe the other thing you should know about that we did, related to when the iPod came - there was this FairPlay thing - there was one American hacker who did this thing called PlayFair.
Arun: Right, that hacked FairPlay.
N: Right, that was a hack. And when that came, it was on Sourceforge, and then Apple followed the guy and asked Sourceforge to withdraw the publication of it. Sourceforge being what it is, asked him to take it off. The hacker took it off and then was looking all over the World for a place to host it. He came to India and we have a site called Sarovar. Sarovar means Lake in India. The Free Software Foundation has this thing in Kerala -- that was a G-Forge instance. It works almost like Sourceforge - he hosted it there, then Apple came there as well and served a legal notice. So it was at that point FSF wanted to do a legal battle with them, saying India has no software patents and you have no right to follow them here. But there is this guy called Anand Babu, who's a hacker who does GlusterFS. Anand Babu contacted that guy, who is anonymous (the author of PlayFair). Anand contacted him and said, "you do the development, but I tell the whole World that I am the developer", and he published it with anand@gnu.org as the contact person for this thing. So it is Apple vs FSF now, and they didn't want to get into it. That's where it stopped.
Arun: Wow, this is real software activism.... really carrying the tradition ....
Seb: We need a sound technician on Stage I.
Someone: Switch on the mike?
Seb: Ah, if it's just switching on the mike, then, we don't need a sound technician.
(laughter)
Sebastian: Let me give you a quick overview of Part 2 of the evening - Jan and me are going to show a bit of the work we've done in the last few years before pad.ma, around pad.ma, and probably we're also going to show you a few things that we're going to do beyond pad.ma, and that would be the future of the online video archiving system that we're developing.
Of course, we're super-excited to have some folks from Mozilla here. Not just excited that you managed to come by, but also because so much of our work builds on the work you've been doing. And it's also good to be able to show you what we're doing with it.
Sebastian: Ironically, I noticed my presentation -- it's not really a presentation, I just opened up some tabs -- it's Webkit, actually. It's totally inappropriate in a way, but also it's totally appropriate because I don't think Webkit would be at the place it is if there wasn't competition in browser space. We're so happy that finally after all these years -- in the last 1, 2 years or so, there is something happening in browsers, and it's actually fun again to develop for the web, because there is a clear target - HTML 5, CSS3, and lots of the stuff we've seen now, that makes things possible that previously just weren't. So if you think about applications and if you think about collaborative applications, and if you think about networking, then it is pretty exciting to do this with these HTML 5 compliant browsers, be it Webkit or Firefox, and it's great that there's competition there.
Sebastian: So Jan and me are both based in Berlin and we're working a bit in between - both working as artists, as well as both working as programmers. And we've done a lot of work in Berlin on intellectual property in the broader sense, but always tried to have some practical results. So, our interest in video archiving - Jan has worked a lot on video and video codecs and Ogg Theora before, but our interest in video archiving got sparked when we were running a cinema in Berlin. And, of course, what comes with a cinema is an archive.
Sebastian: And we noticed that with increasing libraries, we needed a tool for our videos, to be manageable and accessible - just for us to handle our files, but also we thought just as a cinema is a space to share archives, that we would also want to share our software efforts in the field of video archiving. So that was the moment when we, in mid-2007, started to build 0xdb.org. Basically what we wanted is something that we were missing from, let's say, sites like IMDB. Stuff like video archiving of local archives with a strong visual component.
Sebastian: So, if you go to the front-page (of 0xdb.org), you see a bit the scope of it - currently 16,000 files - am not sure if this is readable, and I can't zoom. Ah, can zoom. So this basically means it would take 1 year 28 days 17 hours to watch all this, given you don't sleep. So it's a lot of data. It's 202.947 tera pixels, in case someone cares .. a terabyte is maybe more recognizable.. so, basically, it works like .. when you search a movie archive, you want to find the obvious stuff.
Sebastian: So basically these are things you would expect from any type of movie archive, movie database -- this is now, we've searched for a director, and we've found all the Hitchcock films in the system. Now if you click on something here, lets say we're looking for North by Northwest, we get to an Info page that is pretty much standard metadata that you would expect from whatever -- Allmovie or IMDB, etc. You may be wondering how many people are working on this site, entering all this data - the number of people is zero, because one of our interests has always been to use the web not only as an entertainment medium but also as a resource for data. So what we do in our spare time, but also for this, is we scrape the web a lot. There's a lot of interesting structured data around and if it is not structured, you can make it structured.
Sebastian: So basically this is a lot of metadata that we've retrieved from a couple of resources that are obvious for films, so you can see all these references and trivia and cast, etc. But this is like kind of standard territory. We also have this box here, but I'm going to talk about that a bit later. So this is one just one of the views we have. One of the things we always wanted is to be able to do something like search in a video. Of course, there's people working on things like voice recognition, image recognition, face recognition -- this is probably one or two steps off, but since this archive runs on a backend that's a huge file server that has the video files plus subtitles, we thought it would be interesting to implement something like full-text search. So you can search the scenes also. Let's say we search in the full archive for something like 'San Francisco'.
Sebastian: And now it actually searches all the dialogue of this one year of film, and will give us results. These are now the movies that have San Francisco mentioned in the script somewhere. This is kind of nice, but maybe at this point we're a bit tired of just looking at these movie posters of the results. This is not so meaningful yet. So, what about this.. we switch to Scenes view, and since this is an archive based on the actual visual data of the films, we can do the following: it's a bit sad because it's still a bit slow even with this internet.. I got a bit excited .. ya ..
Sebastian: So, now, this is - first thing I see is a bug but that doesn't matter so much .. so now we get our search results a bit more precisely because they're actually connected to the actual scenes in the film. If we wait for a bit.. you can also set this to less stuff on one page if you want to make it a bit faster. But this is now actually the results for our full text search through the archive, and we get the precise moment in the films where this has been said.
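The scene search described here can be pictured as a lookup over timed subtitle cues. The following is only a minimal sketch under that assumption, not 0xdb's actual code: the `Cue` type, the `search_dialogue` function and the sample cues are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One subtitle line with its position in the film (hypothetical type)."""
    film: str
    start: float  # seconds from the beginning of the film
    end: float
    text: str


def search_dialogue(cues, query):
    """Return every cue whose text contains the query, ignoring case."""
    needle = query.casefold()
    return [cue for cue in cues if needle in cue.text.casefold()]


# Made-up sample data, only to show the shape of a result.
cues = [
    Cue("Film A", 125.0, 128.5, "We drove up to San Francisco."),
    Cue("Film A", 300.0, 302.0, "Nothing to see here."),
    Cue("Film B", 42.0, 44.0, "SAN FRANCISCO, 1958."),
]

for hit in search_dialogue(cues, "san francisco"):
    print(hit.film, hit.start, hit.text)
```

Because each hit carries its own start and end time, a result can point at a moment in a film rather than at the film as a whole.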
Sebastian: So you have a still for every scene. We thought this is quite nice, to get these stills, but how about this.. it was 2007, so this is unfortunately still Flash, and we're really really looking forward to replacing it with something sensible, but what it does is -- all these are videos, so you can Preview Watch this. If it would work ... which in presentation mode it often does not..
Sebastian: Normally in presentation mode then I click on lots of them and it gets even slower .. but this is actually all video. We can talk about the setup a bit more, but if you're interested in that, we can talk about that a bit later also. This is based on a Turbogears backend and there's a local archive backend that communicates with the server, and as it happens here, sometimes the video hasn't reached the server somehow, or something is broken. Or the still hasn't reached the server, or OR or.. but, normally this is video and this plays.
Sebastian: Now, obviously, if I was searching for San Francisco, I might also want to see stuff that is shot in San Francisco. So, we switch to Maps view - we can also get all the films that actually have a San Francisco location. This is also something we're looking to replace, and make a bit nicer. Especially since it doesn't work - we're going to show you that on pad.ma.
Sebastian: One thing we were interested in is also - now that we got full-text search to work through a large movie database, we wanted to visualize film a bit better. If you have a library, if you have a book, you have enormously great spatial orientation with a book. If you remember a quote from a book, at least I almost know - this was on the right page, on the upper half of the page, something like that. So in a book you can browse - you can feel with a book whether you're in the beginning or the end. With film, it's really hard - you know something is in a film, but that's it. You know it's toward the end, but it doesn't give you much. Common formats that you can buy films as, like DVD, they're not very good at navigating film, or making film browsable.
Sebastian: So what we came up with is a visual representation that is called a timeline. This is still now our query for 'San Francisco' - the idea is relatively obvious - one of these lines represents 10 minutes of film, which basically means that 1 pixel is one second of film. And, so what you see here in one block is about 80 minutes of film. And obviously the yellow highlights are now our search results in the film - so you can see San Francisco is being mentioned here, here here, and so on.
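The layout just described (one line per 10 minutes, one pixel per second) amounts to a simple mapping from a time offset to a row and a pixel column. A minimal sketch follows; the function name and constants are assumptions made for illustration, not taken from 0xdb.

```python
SECONDS_PER_PIXEL = 1      # "1 pixel is one second of film"
SECONDS_PER_ROW = 10 * 60  # "one of these lines represents 10 minutes"


def timeline_position(seconds: int) -> tuple[int, int]:
    """Map a time offset in seconds to (row, x) on the timeline."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    row, offset = divmod(seconds, SECONDS_PER_ROW)
    return row, offset // SECONDS_PER_PIXEL


# A search hit 35 minutes 12 seconds into a film lands on the
# fourth line (row 3), 312 pixels from the left edge.
print(timeline_position(35 * 60 + 12))  # (3, 312)
```

Under these numbers each line is 600 pixels wide, and a highlighted search result is simply the run of pixels between a cue's start and end.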
Sebastian: Here we have a search result .. so what's really nice about this timeline view is that you can at one glance get an idea of basically the visual texture of a movie. Like this documentary obviously - of course you see if it's colour or black and white, but you also see a lot about cut frequency, about the visual quality of the film. And now I've made a list here that just assembles a few timelines. I remove the search query to show you just a bit what the kind of scope of things it is that you get there --
Sebastian: If you look at current Hollywood movies, they all look pretty much the same. But some of these are a bit more .. this is for example a black and white film that has some Sepia tone. Also you get into a lot more structural films - James Benning, very clear cuts...
Sebastian: Here you basically see that these are long, slow camera movements. If you work with this a bit you see what's a zoom, what's a pan .. and in the next version we're going to do this a bit more hi-res, so that you see more. This format for Empire by Andy Warhol - pretty dark .. this is more of a joke - this is a film we made for the archive, so this is a film that has a timeline that is a Bruegel painting. And no one has ever watched it. And so on .. so I think you get the idea..
Sebastian: So, if you, and I can show you ... let me go back to our Hitchcock results --
Sebastian: Takes a moment - so, what we've now seen is a results view that has all these timelines, you can obviously also switch a single result to this Timeline view, and there you can actually navigate this. Let's see what happens. So now we're at the very beginning of it - you can navigate with the keyboard also - this doesn't have subtitles -- probably not a very good example, because a lot of it relies on us having English subtitles for the movies and with English language films sometimes we may not have them.
Sebastian: This is also an interesting bug because the same film shows up twice, and it starts to scroll. This is very embarrassing.. let me at least make it so it doesn't scroll.
So basically what you get with a single movie is that if you mouse-over here you actually can read what's happening.. if you're wondering, "hey, what's this red stuff here?", you can click on it and get the video.
Sebastian: And of course then you can search in this, so if you search for 'idea', you get that highlighted.
This is navigating.. 0xdb can do a lot more, but because it's going to be a long night, I'm going to wrap the 0xdb part up. One of the things you've seen before, and that's always a question with 0xdb is -- 'how does that scale, in terms of legal issues.. ?' So, when we started this, we were wondering if rights-holders for these films would approach us and ask us to take things down. So far that hasn't happened and in many aspects what we're providing / claiming for this is kind of fair-use principles - we're not offering films for download, we have tiny little stamp-sized previews, and it's mostly a resource for research and search engine purposes. Obviously, people will have different opinions on this, and there's also different very concrete legal opinions on this, but we thought we just do it -
Sebastian: With Copyright questions, sometimes you can't wait until they resolve, and very often they will not be resolved, so it shouldn't keep you from trying out stuff. But that's why we have these buttons here. Basically, it's automated - it goes by country and year - so recent films have a much higher risk so we have this kind of scheme that classifies these as critical or not critical, so if you're a guest and you just browse the page, you will probably only see the greenish things. Then, if you're a user, you'll probably get until here, and then, the worst is the films where we don't even know if they're recent or legally problematic. This is also why we also have to tell people always that if you want to use 0xdb, you get a couple of features when you have an account there. If you want to do a lot, ask us so we upgrade your account a bit so then you can actually access everything that I can now access as an admin. In the future, this is all going to go because we think we can do this kind of stuff, but that's the idea.
Sebastian: Now from there we move on to pad.ma. One thing, maybe now is the right time to show this -- one thing that makes developing this real fun, is THIS - (laughter) - we decided on this 2 years ago - if you visit pad.ma with Internet Explorer, this is what you get. If you're using Internet Explorer, then you have bigger problems than not being able to access this site. Jan and me were really happy when we came up with this - Internet Explorer solved in like 20 minutes. By now, we've also heard some criticism - it would be unfair, people would feel put off. I think since the statement is true, it still should be there, but of course in the future, we will inform internet (explorer) users that they can install Chrome Frame, or even better, just switch to one of these. But, technically, with things like Chrome Frame, it is that we support it with a plugin and since we are also very eager to get rid of other plugins, this one plugin that makes Internet Explorer webkit is then okay .. But, this is what makes developing it fun.
Sebastian: So, with pad.ma, the case is a bit different. It's not so much an automated archive that runs off a back-end and gets metadata and data from all over the web, but is actually a much more curated and also an annotated and annotatable archive. Subuhi, who works on content for pad.ma, will give you a short overview of what kind of material is in there, but I'm going to show you one thing that we added, which is a kind of logical extension of what you see on 0xdb - let's just Browse the archive, this is pretty similar -- we have our video items here - I get to some sort of info page, you get a bit of preview in the bin here. But, the timeline navigation part has changed from there. So, basically what you get is something that resembles much more a video editing system than just a player with an extended bar.
Sebastian: So, basically, for every video, timeline is still there, but actually now the timeline has a real cursor - you can set in-points, you can navigate around, you can set an out-point, and you have marked a bit of video. And for that, you can actually go and say, add a keyword..
So you have now added this there. And if you navigate the thing, you will always just get the keywords that are at the current position. So, 'blackberry' here you exactly see - this guy here, Real Estate Developer from Bombay is talking on his Blackberry, and he does so from here to here.
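The behaviour described here, showing only the keywords that apply at the current cursor position, can be sketched as a filter over annotations that each carry an in-point and an out-point. This is an illustrative sketch only; the `Annotation` type, the `annotations_at` function and the sample times are assumptions, not pad.ma's data model.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """A marked stretch of video with some text attached (hypothetical type)."""
    in_point: float   # seconds
    out_point: float  # seconds
    kind: str         # e.g. "keyword"
    value: str


def annotations_at(annotations, position, kind=None):
    """Return the annotations whose in/out range covers the given position."""
    return [
        a for a in annotations
        if a.in_point <= position <= a.out_point
        and (kind is None or a.kind == kind)
    ]


# Made-up times, only to show the lookup.
marks = [
    Annotation(10.0, 25.0, "keyword", "blackberry"),
    Annotation(40.0, 55.0, "keyword", "car"),
]

print([a.value for a in annotations_at(marks, 12.0, kind="keyword")])
```

Moving the cursor then just means calling the lookup again with a new position.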
Sebastian: Pad.ma has at the moment, even though it's not limited to this, 4 types of metadata annotations which are Locations - location-based data, Keywords - like tags, Description - free form text, and Transcripts - which is interesting because you can from that think of subtitling, etc. So these are the things you can enter there. You have your player - that's one of the things also - in pad.ma, we're a bit in between - in browsers that support it, this is the <video /> tag, but next version it's going to rely exclusively on HTML 5.
Sebastian: This is Safari, so it's Cortado .. one more reason to use Firefox in the next presentation .. it should still play .. there..
So basically what we have here is spatial navigation with actual marking of points. And if you have stuff marked here you can always do stuff like 'Link to this Selection', so you actually have a really intuitive way to link into a video. The URLs are quite straightforward and nice - leave the last bit here away and you just link to a point, etc. etc.
Since we're trying to not build a system where people bury their data in, but actually a system that also provides for data coming out of it, export facilities - you can also just download all these annotations as subtitle files. You can also import subtitle files. We're trying to not build a grave here.
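Exporting timed annotations as a subtitle file is a small transformation. The sketch below assumes the SubRip (.srt) layout and a plain list of (start, end, text) tuples; the talk does not say which subtitle format pad.ma writes, and the function names are invented for illustration.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used in .srt files."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(annotations) -> str:
    """Turn (start, end, text) tuples into numbered SubRip blocks."""
    blocks = []
    for index, (start, end, text) in enumerate(sorted(annotations), start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Blocks are separated by a blank line.
    return "\n".join(blocks)


print(to_srt([(0.0, 2.0, "First line"), (3661.5, 3663.0, "Second line")]))
```

Importing is the reverse: parse each block back into a start, an end and a text, and store them as transcript annotations.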
Sebastian: You can also - say we have marked out this thing here, you can also Download this Selection, which will give you an actual download of that sub-clip. And you can save that.
Sebastian: One of the things that we haven't really done yet, but I think it's time for Subuhi to talk about content - one of the things - there's an obvious next step in this - which we are looking forward to actually implement this year - which is if I mark up this - set an in-point, set an out-point. I have this thing marked up, now where is my Copy? Where is my Paste? Where is my new timeline, empty timeline? Of course, what would be interesting without implementing a full-featured non-linear video editing system in Javascript, is also the potential of this to actually take sub-clips, make a new timeline, and create some very basic sort of - your own mashup of things. And then, of course, export this to Theora - get this as video.
Sebastian: There's again, a lot more in pad.ma, especially the Maps sector, but I think our Google Map friends don't like us too much today. So you can also have the full archive being displayed to you - any search result on a map. Or in the future, on a calendar or on all kinds of different things. But, now maybe you want to step in for a moment and talk about the scope of pad.ma? And then we're going to briefly show you also after that what we think are the next steps for pad.ma in terms of user interface frame-work, and also - and that's where the circle is closed coming back to Mozilla, in terms of a couple of Mozilla extensions that we either have already or would love to do, to make this archiving stuff even more .. but...
Subuhi: Already a couple of mishaps - I'm only a week old, and I was anticipating very few mishaps. Let's hope there are very few. One of the things I've been telling everybody, when I try to introduce pad.ma, is that it is an interpretive archive, and one that is working with footage and not finished films. I think when some people hear footage, I think that sometimes they misconstrue what pad.ma might be about - which is kind of a dust-bin where you keep video footage, which is far from what it really is. As Seb showed you, in the annotation section is of course where you have a lot of comment going on, which I'll get into a little bit in terms of which comment we are inviting, what kind of comment we're hoping people will contribute and so on.
Subuhi: So, I think we began through also appealing to a lot of film-makers in the documentary film community, and one of the things we were trying to do is figure out - of course, you have a film, which has an edit and there are lots of things that get left out of films. So, those things which get left out of films - well, what happens to them? They stay in people's archives. So, we thought -- in pad.ma, the idea is to find those things which get cut out of films normally, and to bring them here, and share them for various reasons - for paedagogical reasons, for research, for reference - long 1 hour interviews that you only see 2-3 minute clips of in the final films..
Subuhi: And also the idea is to think of video based production as an expanded field of activity. So, who are the people who would potentially use pad.ma? It would be somebody who is publishing video that is not a film, it could be a film-maker. It could be a researcher probing documentary images. It could be a film editor who is organizing footage using the archive. It could be a writer, commenting on videos. It could be a film student working online, it could be an NGO which wants to comment on already existing video or contribute their own video.
Subuhi: So the example that Seb showed of Niranjan Hiranandani is very interesting, because what we did there is .. so, the video follows Niranjan Hiranandani on a normal day at work. He's in his car, he's using his Blackberry, he's moving from home to office and so on.. We invited, or maybe they did it themselves - an organization called Ghar Bachao Ghar Banao Andolan, which is a housing rights organization. What they did in their annotations is commented on Niranjan Hiranandani's land-grabs in Powai. One of the promises that he made is that, "I'm going to develop this area, but part of this land is going to be given to the people who live there." But that of course didn't happen. So, GBGB in their annotations, brought out that missing information, or contextual information about the video. So then you begin to engage with video in a new way.
Some of the other things ... (trying to find the annotations.. ) ..
.. outstanding bug for Sebastian ..
Seb: I'm working on a new bug now ...
Sanj: See, I've got 2 and a half seconds to avoid the other boxes and get there .. not so easy .. no, no no ...
Subuhi: And they're also providing links there to other articles, and so on and so forth .. so a very useful tool also I think for professors and educational institutions, which is something we're trying to develop - basically for pedagogy - this is one of the things I'm really trying to push right now, that professors at universities really begin to look at the video in pad.ma, not just as evidence of something happening, but as something that can be actively engaged by them, by their students, something that they can use in classroom presentation, something that they can download and present somewhere else and put into their PPT. But also video as a repository of knowledge - not just something which provides evidence. And with the textual annotation, I think that becomes much more possible.
Subuhi: So some of the other stuff that is in pad.ma right now - a lot of Bombay stuff - a lot of footage that has been contributed by organizations about Bombay, so there's a whole series of interviews that are from a film, or that were left out of the film, by Madhusree Dutta called I Live in Behrampada, and then you see the Behrampada interviews - the full length interviews.. you'll also see Dharavi Papadwali, lives and livelihoods of people in Dharavi ..
Seb: Dharavi is maybe like SVG... maybe some people don't know what it is...
Subuhi: Dharavi has been nicknamed "Asia's largest slum" - well, that I guess. And now there's some talk about redeveloping Dharavi, so ..
Someone from the audience: And the Slumdog Millionaire connection ..
Subuhi: Right .. which all of you will get .. so some of the other things that we have on pad.ma is interviews with residents in Dharavi who are then commenting about their thoughts on redevelopment, their participation in this process that is being conducted by the State, but what is their participation, how much is the State trying to involve them and so on .. ?
What was the question .. ?
Audience: What are the Remix license rights on the video on the site?
Sanjay: It's a very GPL-style license that we developed. One of the organizations we work with is the Alternative Law Forum, who also worked with developing the Creative Commons license in India. It's a bit different from Creative Commons, in that - Terms will have the details - of course it is a license, so it's legal-speak, but as I understand it mostly, if you use something, the work you produce has to be put out under the same or more open license.
Sanjay: When we were speaking to film-makers, there was a lot of reluctance to putting their footage up - oh, it's the internet, we anyway don't make much money off our films, this way people will .. dadada.. the one caveat that the license includes is it is format-specific. In that, the Ogg Video you download from the website, which is the 640x480 resolution, you can do whatever you want with, pretty much. But, if you wanted the raw footage, the raw DV footage, we do, up until now, archive all of it in hard-drives in our studio, but then you would speak to the film-maker and work out terms. So if you were Pepsi and wanted it for an ad film, the film-maker could make some money off it, if you were a student, maybe different - then it's just something you negotiate directly with the film-maker, if you want the raw DV footage - but the footage that's online has a highly permissive license.
Arun: How is content solicited for pad.ma?
Sanjay: That's, Subuhi - speaking to film-makers, speaking to them - right now, it is quite a personal process in that most of the people we are working with, we are working with quite closely - film-makers are often not so technically savvy, so it goes beyond just giving them how to use the website, but this is also how you use your editing software to cut it up a bit, this is how you grab your video, this is how you export it - it's very easy now to upload video to pad.ma, but people will still often send us hard-drives, so so far it is an extremely involved process - we are hoping to make it less involved and more open. You could upload a video, as well, if you wanted ..
Subuhi: We're also soliciting video content from institutions, so that's the other thing we're working on, in addition to film-makers.
Sanjay: I think it's interesting from a Bombay perspective, and since we're all in Bombay, to run through some of the content, and the different kind of possibilities -- we did this project last year with a bunch of kids in Jogeshwari, about their water problems, and there, the use of pad.ma during the edit process - so what would generally happen is they would shoot their footage, do some annotation of the footage, put it up on pad.ma, and then, while editing their short films, to use pad.ma as a way to reference.. I know this is something most film-makers do - they'll shoot a lot of footage, and then they'll make records of in to out points, and this is what happened - kind of Log Sheets. And, I have done this, on a film I've worked on is like spent 2 weeks translating a lot of Marathi and this and that and putting it on time-codes, but then it was never used by the film-maker because it was easier for them to scrub their time-line in FCP and see what's going on. But, when you have it as an interface like this - it turned to quite a useful tool - like, "Oh, we're looking for that shot, where is it? We can just do a quick search for it, find that shot, ok it corresponds to the same time-code on my tape, I can pull it up using the editing software." So, there is that kind of work-flow benefit, for which this project was quite interesting. And the descriptions here are quite interesting - the kids from there are, just as additional text over the footage, talking about a lake that existed when they were growing up that's no longer a lake, or problems with the corporator or whatever...
Sanjay: I think this is one example I have to show.
Sanjay: There's also the people at Alternative Law Forum - a lot of their work was Film Studies, which I think can be interesting. So, Lawrence, has taken a lot of film clips - wait, this is not what I was looking for .. or what Namita has done for example.. there's little Bollywood clips that are a lot of fun, and Lawrence is doing his PhD, so there's a lot of research over the Bollywood clips providing context, providing history, along with the videos. Some interesting use cases - there was one girl, Priya, who used it as a way to look back at a film she had made 5 years ago or so in college something and really take a critical view of her own film. It was really touching, her response to the website - this dimension of opening up, allowing text on the film, allowed her to look at her footage and present it in a different way, which I thought was interesting.
Sanjay: We have a LOT of footage on Bombay, so for techies, programmers who are kind of interested in doing like mash-ups with the content - interested in recontextualizing, we should definitely talk. I think it's something that's potentially interesting for people in Bombay - even if it's something like looking for videos in your locality and then adding written material over it, getting people from an area to comment on a particular piece of footage, whatever it is, I think it's kind of clear what the feature-set is and what things can be done. We could look at the current API, but I think it's more interesting to talk about what's coming, and I think the API and everything is going to be cleaned up a lot, and I'm going to go back to Jan and Sebastian to talk about that. They're doing mostly a complete re-write of the software, so I think it's going to be interesting to get an idea of future development..
Sebastian: One of the main reasons we're doing something that is very close to a re-write is - right now, for example, 0xdb and pad.ma are two totally different code-bases. And we see even 3rd and 4th types of use-cases -- imagine film studies departments who would say, "hey, I want a body of film, all this cinema, but I want to do annotation on top of it.", or imagine something that is a bit more like YouTube - just upload your stuff, anything goes, but with these annotation facilities. So we want to do a common frame-work for video archives that combines these possibilities. And that also means - the structure of this is going to be there's a Django back-end, there's a Javascript front-end, and there's JSON being passed back and forth and that's it. So far we're still talking about 'website', something that has page reloads, etc. etc., what we want to build is actually a web application in the actual sense of the word.
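The "JSON being passed back and forth" structure can be sketched as a single back-end entry point that takes a JSON request naming an action and returns a JSON response. This is only an illustration under stated assumptions: the `handle` function, the `ACTIONS` table, the request shape, and the status codes are hypothetical and are not the project's actual API; the real back-end is described as Django, which this standard-library sketch does not use.

```python
import json

# Hypothetical in-memory data standing in for the archive.
MOVIES = [
    {"title": "North by Northwest"},
    {"title": "The Big Lebowski"},
]


def find(data: dict) -> dict:
    """Return the movies whose title contains the query string."""
    query = str(data.get("query", "")).lower()
    return {"items": [m for m in MOVIES if query in m["title"].lower()]}


# Hypothetical action table: the front-end names an action, the back-end runs it.
ACTIONS = {"find": find}


def handle(request_body: str) -> str:
    """Decode a JSON request, dispatch to an action, encode a JSON response."""
    try:
        request = json.loads(request_body)
    except json.JSONDecodeError:
        return json.dumps({"status": 400, "error": "invalid JSON"})
    if not isinstance(request, dict):
        return json.dumps({"status": 400, "error": "request must be an object"})
    action = ACTIONS.get(request.get("action"))
    if action is None:
        return json.dumps({"status": 404, "error": "unknown action"})
    return json.dumps({"status": 200, "data": action(request.get("data") or {})})


print(handle('{"action": "find", "data": {"query": "north"}}'))
```

With this shape the front-end never reloads a page: it sends a small request, gets data back, and decides for itself how to render it.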
Sebastian: This kind of development is exploding at the moment - every application you can think of is being re-written in Javascript as we speak, or even compiled to Javascript - written in some other language and then compiled to Javascript.. so we took a long look at a lot of frame-works that would have allowed us to do that -- this is just a selection of stuff we looked at, and some of these are nice, some of these are problematic, some of these are exciting ... ya?
Q: What do you use for search in the video back-end?
Jan: It's using the Turbogears Python frame-work and then a MySQL database and in pad.ma, there is also Solr used, but in the end this is slower .. searching is faster, but editing becomes slow. So, that becomes a problem. In the future I think I will switch to only use a relational database.
Sebastian: We're also going to try this time to release a lot of the base stuff as standalone libraries - both on the front-end as well as the backend - all the stuff that does the spidering - both the python libraries and the javascript libraries. So what we noticed when we were looking at this user interface stuff is that the main paradigm still seems to be in 'widget' space, that you have an HTML page and you enhance some element to make a widget out of it, and maybe even have some magic markup in the HTML that parses your options, etc. and this for various reasons we didn't like so much. So our idea was - what was missing and what we didn't find, I mean ExtJS and YUI could provide some of this, but we needed something relatively special - is to do an actual web application frame-work in javascript, much more than a widget frame-work. Because once you've written a couple of widgets, you notice 'oh, these things want to interact with each other', so you need an event model that knows stuff like focus, for example, which is not the HTML / JS 'focus', but is actually like, the element that currently listens to your keyboard, and all this kind of stuff. So you want the application stuff around your widget.
Sebastian: And, we also want pad.ma to behave more like an application and have a user interface that is more like an application, because people are more familiar with generic application / user-interface standards than the relatively idiosyncratic way we do it in pad.ma. So one of the early demos, this is no longer the code-base we're working on, but just to give you an idea -- looked something like this - that's where you go with widgets - we have these split-panes, we have stuff that reacts to it, you have your collapsibles, your stuff... you have your menus to actually group all this functionality in place, and once we have full screen you will even only have one menu on the screen, which I think makes it a lot nicer.
Sebastian: But, in terms of widgets, you actually, for an application that people use regularly -- if somebody actually used this for research purposes, they would work on this a lot, you actually want a couple of features that you normally find in like file managers on your computer - so for example, you want full keyboard navigation, you want to be able to hold down Shift. And you want to do stuff like type-ahead - India, or Senegal .. you actually want to have full keyboard integration and mouse stuff.
Sebastian: Another thing that we really want to get rid of is pagination. If you go on 0xdb, you get a list, that tells you like 83 pages, and this is not nice. A lot of blogs have implemented these things like where you scroll to the bottom of a long list, and then it reloads, but I think it's only like a next page button that is implemented as a scroll-bar, which is also not what we want. So we have like 8000 movies - we actually want a thing that doesn't paginate. So, even though this has loading times, it is pretty nice. If you just scroll through 8000 entries in this big table here, you can do this. And you can resize stuff, and change sort order, and you can actually use this. And you can do it from the respective menus, etc..
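One common way to get a list that "doesn't paginate" is to treat the whole result set as one tall scrollable area and fetch only the rows that are currently in view, in fixed-size chunks that are cached. The sketch below is an assumption about how such a list could work, not a description of the demo shown: `LazyList`, `fetch_range`, and the chunk size of 100 are hypothetical names and values.

```python
from typing import Callable, Dict, List


class LazyList:
    """Loads rows in fixed-size chunks, only when they scroll into view."""

    def __init__(self, total: int,
                 fetch_range: Callable[[int, int], List[str]],
                 chunk_size: int = 100) -> None:
        self.total = total              # number of rows in the full list
        self.fetch_range = fetch_range  # callback returning rows [start, stop)
        self.chunk_size = chunk_size
        self._chunks: Dict[int, List[str]] = {}

    def _row(self, index: int) -> str:
        chunk_index, offset = divmod(index, self.chunk_size)
        if chunk_index not in self._chunks:
            # Fetch the chunk once; later scrolls over it hit the cache.
            start = chunk_index * self.chunk_size
            stop = min(start + self.chunk_size, self.total)
            self._chunks[chunk_index] = self.fetch_range(start, stop)
        return self._chunks[chunk_index][offset]

    def visible_rows(self, scroll_top: int, viewport_height: int,
                     row_height: int) -> List[str]:
        """Return the rows that intersect the viewport, in pixels."""
        first = min(self.total, max(0, scroll_top // row_height))
        # Ceiling division: a partly visible last row still counts.
        last = min(self.total,
                   -(-(scroll_top + viewport_height) // row_height))
        return [self._row(i) for i in range(first, last)]


def demo_fetch(start: int, stop: int) -> List[str]:
    """Stand-in for a server request; returns placeholder titles."""
    return [f"movie {i}" for i in range(start, stop)]


movies = LazyList(total=8000, fetch_range=demo_fetch)
print(movies.visible_rows(scroll_top=0, viewport_height=600, row_height=20)[:3])
```

The scroll-bar then reflects the full 8000 rows, while the loading time mentioned above is paid only for the chunks the user actually scrolls through.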
Q: So did you write your own library, or is this a mixture .. ?
Sebastian: We do write our own library.. what we're going to do is - it would be really nice if jQuery had a good UI - for our purposes, usable UI library. Still the paradigm of jQuery-UI, though it's kind of nice, is not exactly what we want, so what we're doing is we're writing widgets that wrap jQuery. So if I have my fancy widget, I can just call all the jQuery stuff that people like, like appendTo, or click handler, or whatever, and it would automatically bind to that element, but it will also return me my own widget, so I can still chain stuff. So I think in this way, we get best of both.
Q: Do you know what the status of ThunderHead is? The thing that's underlying Bespin...
Sebastian: Bespin, ya..
Q: It's trying to do the same thing, I would expect..
Sebastian: Of course - Bespin, for those who don't know - is a collaborative, online, text editor with a focus on coding that is fully implemented in canvas.
Q: Yea .. the only drawback to that is canvas is not really accessible .. I think what you really want is accessibility, I would imagine ..
Jan: But there is also video and this other HTML functionality - so we don't want Canvas for most of our widgets..
Q: Exactly ..
Arun: I think his point is that the ThunderHead library is open source and you can play around with it.
Sebastian: I think, for now, for the next 1 or 2 years, the thing is to write application interfaces and abstract from HTML, because no one wants to write CSS and HTML to do this - you will never want to write a <div /> if it's to attach your widget - it's nonsense. It's not pleasant, if you really want to write this stuff.
Sebastian: In the process, of course you will have - once you have HTML 5 video our scenes views can look very different - this is just a search result, you can do actual stuff - you can do all kinds of transform stuff - I don't even have a purpose for this, but this kind of mirroring thing. This is what we can do - you will want to have, once there's full-screen, you will want to take, say this is a pad.ma page and you have a small video, you want to be able to open this up, and have a video player that has a graphical timeline that you can actually play with - but you also want to kind of full-screen it, you want to be able to - just like what you would expect from a video player. You want it to keep aspect ratio..
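Keeping the aspect ratio when a player is resized or taken full-screen comes down to one scale factor: the smaller of the horizontal and vertical ratios between the container and the video. The function below is a minimal sketch of that calculation; the name `fit_video` and the pixel sizes used are hypothetical.

```python
from typing import Tuple


def fit_video(video_w: int, video_h: int,
              box_w: int, box_h: int) -> Tuple[int, int, int, int]:
    """Scale a video to fit inside a box without distortion.

    Returns (left, top, width, height) of the centered video in pixels.
    """
    # Using the smaller ratio guarantees that both dimensions fit.
    scale = min(box_w / video_w, box_h / video_h)
    width = round(video_w * scale)
    height = round(video_h * scale)
    # Center the video; the leftover space becomes bars on two sides.
    left = (box_w - width) // 2
    top = (box_h - height) // 2
    return left, top, width, height


# A 16:9 video in a 4:3 window gets bars above and below.
print(fit_video(1280, 720, 1024, 768))
```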
Sebastian: Same goes for the timelines - of course, we have timelines already, but they're relatively static - what I want to do in the future is just have these on the page and play around with them, have a bit more info on the actual timeline when you mouse-over it, etc.
Sebastian: We definitely want better maps. The new Google Maps API performs better than the old one, so we definitely want maps that are more interesting - annotating stuff on a map is really a pain, so, since Google Maps provides reverse geo-lookups, you can also just zoom into Bombay and look what's there.
Sebastian: So we have Bombay, but if you zoom in a bit you get that, much more detailed .. a little optimistic, but ..
Sebastian: So, these are a couple of things we want to do - ah, one reason we're not showing this in Firefox actually - there is a good reason for it - if you look for the UI demo, what you actually see here in Safari, is the scrollbar theme blends with our other, like, let's say here - let me just click on this a few times just to see what's going on - so now we have all these strange input elements, and widgets and stuff, and now we have these interfaces that resemble them -- if you open this in Firefox, I think this is my favourite missing feature in Firefox - you get, if you have this page here, and it all works, but what you get here if you click on it, is a scroll-bar that Apple invented in the early 2000's. Back then, early versions of OS X, actually had this glossy candy-like operating system - even Apple isn't so happy with these things, but if you do your themes.. and of course you can switch themes - like you can make this dark and you have it - then you also probably want to take care of these elements. This is one of the things that at the moment is nicer in Safari. And it's pure CSS, so there's nothing you have to hack for it.
Sebastian: And you can now of course also implement scroll-bars in your own way - I mean, scrollable things, but everyone who has ever thought about scroll-bars - you actually do not want to implement fake scroll-bars in a browser. If we could not do this, I would be very happy to not do this.
Sebastian: Still, to finish my part, and then Jan is going to take over - what we think of this video archiving software is that it is collaborative - that it can work on multiple back-ends. And in terms of application paradigm, I think what is a good and popular application for media libraries is something like this (iTunes). Forget about the eye-candy, but in terms of .. lists of your things, your lists of lists of things, certain graphical representations of things - you can click on stuff - quite okay power features. This is how many people manage music and it's kind of nice. So, if we had this type of approach and a bit in terms of user-interface paradigm if we had just like things you know from a file browser, like you select your thing, hit space, you get some sort of preview.. and you can just hit space and it disappears .. these kinds of paradigms.
Sebastian: The main thing when we were thinking about iTunes was iTunes before they turned off sharing. What about if you could just.. and with the Mozilla extension you could, specify a local path where your archive resides, and then actually the extension would take care of extracting stills, producing the timeline, or to be even simpler - encode the video, upload the video, and then on the server you can get all this visual media, metadata, etc., but from the user's local file-system. While we're at that, why not think about what would happen if I would now just drag my video file on this.
Arun: Wait, that is possible ..
Sebastian: This is a good moment for Jan to take over.. now we're getting into extension territory ..
Jan: Yea, I mean - for us we want to drag it into extension space. Again, there's an open bug for it, right now it's not possible - Firefogg soon should be able to take over the file you have dragged on. Right now you would have to load it into memory and then you can use it in the extension again. Because I don't want to upload the video someone drags, but I first want to transcode it and then upload the results.
Arun: I see.
Jan: Yea. I wrote a patch for it, but it's waiting for review.
Arun: I just wanted to make sure you knew Drag n Drop is available through the FileAPI.
Jan: Yes, using that you can already drag an image and then upload an image.
Arun: Yes.
Jan: But, in our case we want to process the video..
Arun: Ah, it's not available to extensions, I get it .. it's not available to Firefogg..
Jan: Yea. I can only get the data, but not the path to the local file..
Arun: Got it. That's our next.. we're working on it..
Jan: Yea, there's a patch, it just probably needs your review, and then it's in..
Axel: Doesn't ... (?)'s demo do local editing and then uploading .. ?
Arun: Yea, it does do local editing..
Jan: But not in extension space.
Arun: I think what he's saying is he needs extension space to get access to it, so that you can actually get the data, maybe transcode it, and then upload it. It's not available to extension space ..
Axel: Aha, okay.
Jan: On pad.ma, there were several things that didn't work initially - we had to use plugins for playing back the video, but there were also certain other things that we wanted to have - one thing is to be able to work with the site in low bandwidth conditions, and in the end, the video is sometimes too big or you want to have it in a higher quality, so one extension we wrote is a way to select a local file, and use that on pad.ma itself. So I have this video I downloaded - you can download the videos from the site, or you got it on a USB stick or you uploaded it before, so you already have it, so why would you want to ...
So I do that, and it opens the site itself, but the video you are using is then local, and that means if you play it back it behaves much faster than using remote sites, especially if your network connection is not so fast. But also if you're working on transcripts usually it should make it easier.
Jan: So this is like a simple extension that allows to extend this work-flow. The other part is uploading videos - we only accept Ogg Theora videos, but how would you get your video into Ogg Theora? There are tools for it - not so many graphical. Some time ago, I wrote a command line tool called ffmpeg2theora, which is used a lot, but then many people editing video don't really want to open their command line, or they don't know they have a command line or how to use it to encode videos. So, somehow this should also happen in the browser - there are also other sites interested in using this, so this became a more generic project which is Firefogg, which allows you to specify on your own page, which settings you want for people that upload a video - or if you upload a video yourself to a blog or so on, to define the settings you want, and then each time you select any video - it could be MPEG-2, or it could be a DV file, best quality you have. There is also no need to make a small video. And then encode the video in the settings you want to use it on the site, to display it in HTML.
Jan: And, then the video is encoded in the browser - with a progress bar - and is uploaded to your page. There's a demo here where you can also encode local videos, if you don't want to upload it, but you want to encode it locally, you can use this to put settings - you can also set in and out points.. maybe we should redo this site at some point with Oxjs once we're done with it .. ? This so far is out of working with Wikipedia on their upload functionality ..
Jan: So, you encode it, select the file where you want to save it .. let's save it here, and it's now working, transcoding it - while it's doing it, you can also see what's happening. So, in this case, it's not uploading it anywhere, but once it's done, you see the video inside of the browser in a <video> tag. You can play it back, you can drag it from here to somewhere, or you can find it in the location you specified before.
This functionality is on pad.ma. Videobin is a small page where you can upload videos if you want to send someone a link. On Wikipedia, on Commons, you can also use it to upload videos.
Creating the video was one problem - uploading videos to websites with a POST request was another big problem, which usually fails if you try to upload your 150MB video with POST request in a form, especially if you have a shaky connection and after 99% it stops and then you have to do it again - this is really annoying. Firefogg includes a chunk-based upload approach, where you upload the video in 1MB chunks, and this happens while you are encoding, so not only does it make the uploading more robust and the server doesn't need to have these long connections open all the time, you just upload 1MB once it's done, but it also happens at the same time as you encode, so the time needed for that is reduced.
Q: What is the type of each chunk, as POST data?
Jan: So .. encoding at the same time .. basically you initiate the upload, you get a URL to which you send the chunks, and it's a form request with 'chunk' as a file. It's basically a multipart form with a file chunk and you just send 1MB after another, and in the end you say, "Ok, I'm done", and the back-end can process this. There's a Django example doing this, and if you're on this page, there is a repository with a Django example and a PHP example doing minimal server back-end for that.
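The sequence just described (initiate the upload, send 1MB chunks one after another, then signal that the upload is done) can be sketched as follows. This is not Firefogg's code or its server protocol: `InMemoryBackend` is a hypothetical stand-in for the server so that the example runs without a network, and all method names are invented. A real client would send each chunk as a multipart form POST to the URL returned by the first request.

```python
import io
from typing import BinaryIO, Dict, Iterator, Set, Tuple

CHUNK_SIZE = 1024 * 1024  # 1MB, the chunk size mentioned in the talk


def iter_chunks(stream: BinaryIO,
                chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield successive chunks from a binary stream until it is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


class InMemoryBackend:
    """Hypothetical stand-in for the server side of a chunked upload."""

    def __init__(self) -> None:
        self.uploads: Dict[str, bytearray] = {}
        self.finished: Set[str] = set()

    def start(self, name: str) -> str:
        # A real server would answer with a URL to send the chunks to.
        upload_id = f"upload-{len(self.uploads) + 1}"
        self.uploads[upload_id] = bytearray()
        return upload_id

    def add_chunk(self, upload_id: str, chunk: bytes) -> None:
        self.uploads[upload_id].extend(chunk)

    def finish(self, upload_id: str) -> None:
        # "Ok, I'm done": the back-end can now process the whole file.
        self.finished.add(upload_id)


def upload_in_chunks(stream: BinaryIO, backend: InMemoryBackend, name: str,
                     chunk_size: int = CHUNK_SIZE) -> Tuple[str, int]:
    """Upload a stream chunk by chunk; return the upload id and chunk count."""
    upload_id = backend.start(name)
    sent = 0
    for chunk in iter_chunks(stream, chunk_size):
        backend.add_chunk(upload_id, chunk)
        sent += 1
    backend.finish(upload_id)
    return upload_id, sent


backend = InMemoryBackend()
payload = bytes(range(256)) * 10000  # 2,560,000 bytes of placeholder data
print(upload_in_chunks(io.BytesIO(payload), backend, "demo.ogv"))
```

Because each request carries only one chunk, a dropped connection costs at most one chunk, and chunks can be sent while the encoder is still producing later ones.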
Jan: This is how the Javascript would look like that you would embed on your page .. in the end, this is what you would need to include Firefogg upload into your page. Right now, for Django Admin there is also a module you can use to enable it on any file you have in there. It would be interesting to include it in other CMS systems as well. You can find documentation on the API here.. something that was mentioned earlier is the ability to render videos with Firefogg - maybe the canvas - I didn't think of rendering a canvas, but wouldn't it be great to have HTML as a rendering resource, or the browser, not just the canvas element, but anything I can composite in a browser, and render this as video. So, there is an API for this in Firefogg as well - to render - so, you have a <div /> in your HTML page and you can tell Firefogg to grab this as a frame and then you can with that programmatically create a video from any HTML page.
Some videos, an audio track, and now I render this ... and so it's taking the videos in the order - I just changed that - there's a <div /> element on top of it - I could also just drag this and then later on in the video we will see that, what I did here - when I dragged this here - I'm not sure whether that's a feature or not, but it would allow something like screen-casts from websites.
I selected an audio track there - the API also allows you to select multiple audio tracks and add silence at certain parts. But you can also do it without that. Once this is done.. we can open this.. that's the wrong one .. and, now have that .. as you can see now, the thing also comes faster than I moved it, because in this setting it wasn't rendering in real-time, but maybe for screen-casts that can be worked on. Ok, that's Firefogg.
Sebastian: The file system is probably next...
Jan: There are some issues with that I will mention later regarding drag n drop and Firebug and Firefogg not working together .. but, otherwise .. any questions .. ?
Arun: Aside from some of the debugging that needs to happen, what's the future development plan, where you see this going - features you plan to execute .. ?
Jan: One area in which there was an attempt already would be the ability to record audio and / or video - to include that as an ability - so you could say just record your video, or make an audio comment, or audio recording, and then post that to a site. So it's not just for selecting existing files from your file system, but recording things. Not necessarily live broadcast - it could also upload while you're recording, but more for making a small commentary..
Seth: Have you by any chance met the guys from Miro who are starting a new project called the Universal Subtitles Project .. ?
Jan: Yes, in fact I've talked with them about that and I think it came out as a result of them seeing what we are doing and Sanjay also is working on a subtitling .. - we also have other extensions which are - I don't have them installed, but maybe Sanjay can talk a bit -- some of them were not really extensions until yesterday, but we have some tools to do rough or fast translations and subtitles as well, with a video, which have been used before things are imported into pad.ma. To do the transcript track, people use this instead of pad.ma.
We are not sure if this is something that stays like this or should be tightly integrated in the next version of pad.ma. But, yea, saw the work from Miro - there is communication.
Axel: Question back on pad.ma - can anyone go in and use pad.ma to upload and work on their content?
Jan: So right now, if you upload your content and work with it, that would work without any problems - it would not be visible to anyone but you. So there is a moderation, a process... - not many people have uploaded on their own. We didn't have the uploading button and interface enabled for, I don't know exactly what the reasons were - it was there, but we never really enabled it -- so, you can upload material, our interest is that the material should also be annotated as well - so at least for now the material that is visible on the main site is tightly annotated. So, if you just want to play with it you can also do that - it doesn't show up for anyone else. If you want it to become public, you have to somehow contact one of us and tell us you want to do it, or you can do it, and then we put it public. That is right now the thing - so it's not a place where you can just upload and it will be public.
Sanjay: But, you can send that URL around.
Jan: Yes, it just doesn't show up in the search results.
Sanjay: Yes, it doesn't show up in the search results until an admin hits the publish button.
Sebastian: It's an obvious application - we never really thought of the success of pad.ma in quantitative terms, always in qualitative terms - but obviously there is also quantitative.. there are quantities - and I mean, the amount of video that's on the net and still going, we're not thinking of capturing any portion of that, but hopefully in a future version, you will, whatever the project is called - you will have sandbox.0xdb.org or sandbox.pad.ma where you can just upload your stuff and play around with it, and that can grow
Jan: And there are also several groups that start to have their own instance or look into working with their own instance. Right now since we are working on the new version it would be better to use that, but again, you have your install (@Dr. Nagarjuna), and there's several others that have the current system installed and work with that. So, that is also an option if you have more material. We're right now not a YouTube-like repository of random videos.
Arun: What are the source licenses?
Jan: So, mostly it's GPL. So, pad.ma is GPL v2 or v3. I think it's older than 3, so it must be 2.
Nagarjun: Pad.ma is GPL v3.
Jan: Then I updated it. So, it's GPL v3 - there's http://code.pad.ma .. I can't type with one hand.. that's the pad.ma code-base, that's the local extension .. an old annotation client, soon there will be a current transcribing extension .. so let's just look - GPL v3. The other repository is http://code.0xdb.org/ where the current work is. Right now it's still called oxdb, it might be renamed. This is the current code-base for the back-end, which is doing many things, but we have to do a lot of work on this in the next months.
And Oxjs is the javascript frame-work and then there's a set of tools - oxdjango is a set of tools that are more generic that don't fit into this project, and oxlib and oxweb are also behind the old 0xdb code-base. oxlib is more like a web-browser maybe, in Python, where we have basic caching and parsing functionality, and then there's a set of Python interfaces for certain webpages that can be used to read, or used in scripting environments.
All these are GPL v3, so you can use them if you like, and contribute patches, file bugs.
::Applause::
Jan: Any questions, or any instant presentations of other things .. ? Or ideas of things to do .. ?
Seth: Thinking more largely .. if you guys want to do things like use the Geolocation API in Firefox to layer in a location-based thing if you wanted to find all the movies that are shot right here..
Sebastian: So far we have a script that parses whatever the providers - Maxmind or something that do the same thing, but it's really nice to offer it in the browser API ..
Seth: It struck me as really interesting - apart from being a research database for movies, from an end-user's perspective, anyone who might have an interest in movies and is in a city and wants to find out what movies are from here ..
Sebastian: No, maps are really interesting - we want to do 2 things - one, parse all this location data and event data. Location is stuff you can put on maps and events is stuff you can put on a calendar, that can be dates, events - Emergency, Second World War, Renaissance, for instance - so you can basically replace your list of movies with a map, but also with a calendar.. I think that could get dense and interesting.
Seth: Another thing I thought was interesting - I noticed at the bottom of one of the pages you showed - you had North by Northwest and you had a list of movies that spoofed on it, and you mentioned Simpsons and the Big Lebowski. It might be interesting to have a map or something there - to say "Wow, Big Lebowski spoofed North by Northwest, what else did it spoof .. ?", and you can maybe call it up in your DB right away..
Sebastian: It's a bit of an interface question of how you do this, but if you could provide an API or something that is more a command-line thing that lets you take data that is mostly IMDB and put it in useful form - they put a lot of open stuff, like Freebase, for example - something like Wikipedia for data, in the broader sense. We're dealing with movies here - we're dealing a lot, also in other projects, we're dealing with a lot of large datasets, all have to be extracted from somewhere - ways to make them more accessible and provide APIs to them. We didn't demo the API for the next iteration of this, but it's really nice...
Jan: A self-documenting API .. that's the current backend.. the most interesting one is the 'find' function, but yea..
Sebastian: Of course, this site doesn't have to render stuff as HTML ..
Jan: auto-complete for search, as well ...
Arun: We actually have a fourth Mozilla musketeer who's waiting for us to get back, so maybe we should wrap up ... firstly, thank you everyone for coming. We brought t-shirts for everyone at this event, so please grab a t-shirt, I hope the sizes fit you. I think we only brought one size..
Seth: Nope, we got big and small..
Arun: And I just did a quick count, and I think everyone will get a t-shirt, and if you don't, well, I'll try and send you one, or I'll come back - but I think everyone gets one.
Secondly, the three of us are probably really interested in keeping in touch with everyone who came to this event, so if you remember, my name is Arun - my email address is arun@mozilla.com. Some of you spoke to me about websites in India that block access to Firefox. That's a problem that I'm really personally vested in solving, and I think all of Mozilla really is. We have some statistics that say we're at 30% of the Indian market. And some statistics that say we're not. So, we don't have a clean idea of how many people are using Firefox. We should improve on that, but, I have a mailing list at Mozilla that I'm the admin of, called community-india@lists.mozilla.org. It's a very low volume listserv, so don't worry - you're not going to get a lot of spam or anything - it's just about Indian events that pertain to Mozilla. Like if we were coming on a visit, or if we're doing a technology event, or if we're speaking at the Indian Institute of Science, or if we're visiting the IIT at Powai, or something like that, you would know of what's going on and about technology that's emerging..
Arun: It's a very low volume listserv and I'd love for all of you to sign-up, but email me and keep in touch, and thank you very much for coming. And, thank you to you guys, Sebastian, Jan, thanks especially to Sanjay, for hosting us, getting us the food and all that..
::Applause::
Dr. Nagarjun: I want to make an art-work. If you could put all those clips here - I want to take a picture against the time-line, and then all of you could stand here, and I want to take a picture from here....
Sebastian: These browser wars we win...