A blog about Open Source, my work at the Gates Foundation and those I am fortunate enough to collaborate with
OK so I wouldn’t blame you if you haven’t heard of the Data Use and Reciprocal Support Agreement, or ‘DURSA’. I hadn’t either until Steve Midgley turned me onto the fact that David Riley’s work on Connect was predicated on a constructive legal framework capable of supporting federal, state and commercial actors alike. No small undertaking, particularly when you consider that Connect! was launched with 15 private sector companies and 15 federal agencies (that’s at least 30 lawyers, all of whom would rather the whole topic of exchange of sensitive patient data across open networks just went away).
In this post I just wanted to share what DURSA is, and why it might matter for education. I am not a lawyer, so what follows represents a layman’s view.
What is DURSA?
DURSA is a multi-party legal contract that supports the nationwide exchange of both generic and sensitive Health Information across one or more public and private networks (‘NHIE’) at a variety of levels including local community, region, state and federal.
What problem does DURSA address?
For a long time health information was exchanged point to point. A somewhat crude and basic arrangement but the trust framework required was at least straightforward. You knew who you were dealing with and in all probability had a document that memorialized that relationship. Unfortunately after one or two of these, it becomes increasingly burdensome to establish and then maintain the resulting web of relationships and reciprocal use requirements – not to mention liabilities. This was therefore never going to be a sustainable or scalable model for the exchange of data over ever broadening data networks. Social Security and the Veterans Administration, two of the most advanced data networks in the public sector, realized this early on. Both have far-flung networks and both lacked the infrastructure to manage linear, let alone exponential, growth in point-to-point data exchange agreements. Both were looking for a means to foster data interchange between organizations that may not be known at the outset of any network.
Why is DURSA noteworthy in the context of exchange of sensitive data?
Due to its nature, DURSA serves as a basis for a community of data exchange to emerge and sustain itself because its contract is predicated on a set of values and ethics that community members share and are therefore willing to be bound to legally. In its contractual form, DURSA therefore enforces specific rights and responsibilities in support of HIE. e.g. parties agree to
a form of self-governance manifested in a coordinating Committee. DURSA creates this committee and parties sign up to follow the requirements and sanctions of the committee. Important to note that DURSA asks signatories to go beyond being just bound by a contract – DURSA embodies governance by consent of the governed.
There still is no legal authority in Federal or State law that creates any type of binding governance authority over data exchange. DURSA, for now, is the only game in town.
What is noteworthy about DURSA itself?
Recently had a fabulous conversation with Tony Galluscio, Healthcare Solutions Product Line Manager for Harris, about his work with David Riley, Vanessa Manchester and Brian Behlendorf on the Health Information Exchange known as Connect! Purpose was to understand the journey he and his team took in making the shift to Open Source with a broad community.
Fair to say that there were some clear commonalities that popped out during discussion between our work at Gates on the Shared Learning Infrastructure and the Connect project. You’ll also see themes that will remind you of some of Karl Fogel’s guidance from his book on OSS production.
Here are the highlights:
What were the key actions Harris took to organize for Open Source?
When did Harris make the decision to open up? Did they wait till they had a critical mass of code/stories addressed before doing so?
How did Harris organize for distributed agile teams?
How hard was it for Harris to open up the work, given they were developing a secure PII exchange network for 26 different
Harris had some early concerns
How did the codeathons work out? Wasn’t it a challenge given there were competitors in the room?
Codeathons exceeded expectations and have since taken on a life of their own (for the good)
Key contributors to success:
Once the code was made public, what actions did Harris take to sustain momentum among developers while continuing to advance the larger project?
Finally, wasn’t there a risk that the community would take the code in a completely different direction from the core effort?
That is always a possibility with OSS. However there was one instance which started out bad and ended very well. This concerned development of a branch of the code to deal with query retrieve gateway capability. The work quickly attracted a vocal base who felt that the branch represented a lighter-weight solution to the entire project. There was noise and debate in the developer community for a while, not always constructive. EOD Harris ended up incorporating the code into their Florida HIE project. Led to a win-win. Wouldn’t have happened without the fork.
I wasn’t sure how to title this one (after all, it’s a 03:30:00 keynote!!!!). There’s so much in what Sinofsky lays out (summary @ 01:42:52 in the video). WinRT = equivalent of AMZN primitives for native apps. From a business model perspective his team has made Windows into a software-licenseable equivalent of a “cash-and-carry”. It’s a smart, smart preemptive move on behalf of the OS. Even Mossberg caught on to the fact that Windows 8 is the biggest step fwd since W95. “Touch-first” or “Tap to sensor” anybody? (wave my business card over the PC screen and it goes straight to my blog)
When Sinofsky talks about reengineering from the “kernel on up” and “fundamental performance at the base kernel level”, he’s talking about improving a literal platform, not the hackneyed figure of speech that gets tossed around these days. Cold boot in under 8 seconds and then straight onto your web apps (not forgetting the hardware accelerated graphics)
That said, note that the seamless in-page UI experience is all Apple, while the equally impressive system experience is uniquely MSFT. Care to take bets on which company will capitalize on all this advancement with consumers??
EOD, it is exciting to imagine what a teacher’s start page might look like powered by something like the SLI API inside of Win 8 (or Lion OS++).
Not exciting enough?? Well imagine that capability coupled to this http://www.bbc.co.uk/news/science-environment-14900800. Want to log in to work? You lie sucka!!!!
Can’t resist: final note (a prediction) – Sinofsky will be the next MSFT CEO and will be responsible for breaking the company up and unlocking 100%+ shareholder value.
Final, final note: you don’t need to watch past ~01:50:00. Although if you want to see a developer geek out on his own geekiness fwd to ~02:45:00.
That’s it. Doodles passed out long ago. Off to bed.
Wanted to pass along this simple 1-page framework for data standards that my colleague BrandtRedd just released under CC-By. Given fuzzy language and confusion circulating in this space within verticals like Healthcare and Education thought it might help clarify the stack that is standards. stevemidgley has proposed – correctly in our view – that you could probably unpack the 4th layer – Protocol into a few layers of its own.
Brandt – take that as a request for v2 :-) Thanks to you and Sally Askman for this contribution to the discussion.
Nice interview featuring David Eaves at OSCON talking about how the Bugzilla team is using community performance data to inform improved community management practices.
Developer experience is everything, and it can make or break an OS project. People either enjoy their coding experience or they do not.
David points to Github as a real innovation in the approach to OS development. It lowered the bar to entry and decentralized the code management process somewhat so that forking could occur in healthy and explorative ways. Just because a fork occurs isn’t a bad thing. The product that results still has to be defended with the original community and stand on its merits as a workable piece of code. HOWEVER, Github or Bugzilla alone do not make for good community practice. For that David argues that you need to use the data being generated by all those users to infer whether their experience is measuring up to their expectations.
Two immediate areas of the UX that David points to as in need of improvement: shorter on-ramps and lower transaction costs to code commits – how long does it take to get up to speed on a branch or sprint? What is the lag between patch contributions? How long is the code review process? David singles out the latter as a real villain of the piece and he explains why at Bugzilla they have decided to track code review at the project, module, bug and user levels. By doing so Bugzilla can set developer expectations ahead of a commit, while at the same time tracking manager execution of the review process itself.
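To make the idea of tracking review lag at different levels concrete, here is a minimal sketch. It is not how Bugzilla does it; the `Review` record, its field names and the sample data are all assumptions made up for illustration. The point is simply that one list of review events can be sliced by project, module or reviewer to show where contributors end up waiting.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median
from typing import NamedTuple


class Review(NamedTuple):
    """One hypothetical code review event (field names are illustrative)."""
    project: str
    module: str
    reviewer: str
    requested: datetime
    completed: datetime


def median_review_hours(reviews, key):
    """Group reviews with `key` and return the median wait in hours per group."""
    waits = defaultdict(list)
    for r in reviews:
        hours = (r.completed - r.requested).total_seconds() / 3600
        waits[key(r)].append(hours)
    return {group: median(hours) for group, hours in waits.items()}


# Made-up sample data: waits of 8h, 48h and 4h respectively.
reviews = [
    Review("core", "auth", "alice", datetime(2011, 8, 1, 9), datetime(2011, 8, 1, 17)),
    Review("core", "auth", "bob", datetime(2011, 8, 2, 9), datetime(2011, 8, 4, 9)),
    Review("core", "ui", "alice", datetime(2011, 8, 3, 9), datetime(2011, 8, 3, 13)),
]

by_module = median_review_hours(reviews, key=lambda r: r.module)
by_reviewer = median_review_hours(reviews, key=lambda r: r.reviewer)
print(by_module)
print(by_reviewer)
```

The same function covers the project and bug levels by passing a different `key`. Publishing numbers like these ahead of a commit is one way to set the expectations David talks about.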
David closes with a wonderful example from LA where the city authorities opened up access to health inspection reports. Bloggers and Media outlets like the LA Times quickly got hold of the data and started mashing it together with Yelp and GoogleMaps to present healthy eateries in neighborhoods around the city. Can you guess what happened next?
Er, this was a debut?! Holy cow can Luther Allison play! I keep coming back to this album for its syncopated blues style and the bittersweet tones Luther gets out of his guitar. I mean this guy just explodes! He crosses forward and back between Blues and a Rock and Roll style. You’ll hear Chuck Berry in there, fused with Jimi Hendrix riffs. If you’re hesitating about the Blues, this is one album you ought to listen to before reaching a decision.
While at OSCON Danese Cooper and Brian Behlendorf kindly invited me to sit in on discussion of a new initiative that Jim Fruchterman, Gerardo Capiel and the team at Benetech are cooking up to mobilize OS developer talent against the societal issues we face in areas like Human Rights, Healthcare, and Education. I was very glad that I did and I am hoping you can contribute to my thinking about how we at the Foundation can help this very worthy initiative.
Gerardo is Benetech’s VP of Engineering. He’s motivated, passionate for the cause, and a marvellous bloke to hang out with. In 30 minutes I was given 3 demos, the history of Benetech, and a nagging sense that if we could find some excuse to pair Benetech’s savvy with Gates resources then crazy good things might result.
Gerardo led off the discussion by painting a pretty compelling picture: Foundations and other NGOs could publish issues in need of developer talent. In return, developers would have access to new opportunities to diversify their programming skills and build profile through contribution to critical issues. What would make this different from Guru.com or similar source exchanges is the ability of the venture to leverage its brand in order to present meaningfully curated opportunities to the community, rather than just random job posts. Second, Benetech would leverage its network to pair up sometimes significant Foundation resources against shards of developer time that are going to start to emerge as a result of the new paid time off (‘PTO’) policies being instituted at places in the Valley like VMware.
From my own experience I am somewhat familiar with the messy and frankly limited-value experience developers risk getting from donating time to non-profit software projects. In my case I acquired it at MSFT while organizing a skunk works OS project housed in Codeplex which was aimed at producing a lightweight SMS-based reporting app for health extension workers in Ghana. It was sort of like how I imagine sport sex to be: anonymous, casual and in the end, naggingly unfulfilling. Everyone exited the project unsure of their ultimate contribution and wanting to feel better about it than they actually did.
This is where a group like the Gates Foundation could bring resources to bear on behalf of Coding4Good and raise the quality of the developer experience while contributing to the likelihood of sustainability for the venture. I wanted to give you a sense of where I think a Foundation could contribute (with thanks to Theo Schlossnagle for validating some of these points during the discussion):
Leverage the Foundation’s megaphone on behalf of the developer community – getting the word out and sustaining it can be an expensive proposition these days. Foundations will have resources for the community to use on its own behalf
Budget for the time volunteers will need to ramp – It’s notoriously hard to get a resource spun up in a timely fashion. Providing a developer with context for why their contribution is needed (and valued) can make the difference between the project enjoying the efforts of a committed and motivated developer versus a work-for-hire Mechanical Turk
There is no substitute for a full time PM/ScrumMaster – I learnt this first-hand. Consultant or part time won’t wash. Experienced PMs are hugely valuable (and a frustratingly scarce commodity in today’s overheated Valley environment). Foundations can often dig deeper than non-profits to secure and retain the right PM talent for the duration of a full release or sprint sequence. Khan Academy recently demonstrated this fact by hiring away the amazingly talented John Resig from Mozilla (Footnote: I do worry about sustainability of Khan’s hiring approach. With the perks starting to fly around the Valley, Khan’s hiring halo will begin to wear off and they need to plan and budget accordingly.)
There is no substitute for a full time UX either! – One of the first casualties of almost all development processes is the UX. It’s a real discipline and so often falls victim to other priorities or the tyranny of budget and “deliverables” (this is why I LOVE UX and Agile development process and why I advocate for it at the Foundation!) 1 UX + a strong PM can equate in many instances to a more traditionally staffed development cycle if it is tightly coupled to developers
Federate but know where to tightly couple – and in those instances where you need to tightly couple, marry your PTO resources with the necessary resources and a sequence of clearly delineated sprints
Let developers stay in touch with the code and see the impact of their contribution – this is where thoughtful use of version control systems like Subversion can make or break a collaboration. Foundations will have little to no idea of how to manage a community and its contributions. Benetech – and talented souls like Gerardo – will.
Please ping me with any refinements to the list above. I hope to have Jim and Gerardo up to Seattle to discuss a collaboration around their new venture and would appreciate any thoughts you might have on how Gates could contribute to a meaningful developer experience coding4good.
Shorter was a major forker. You’ll find traces of Monk, Gillespie and Davis but also paths toward Dolphy, Kirk and later Davis. 1961 and Shorter was laying out a new version of the menu. As Shorter reveals on the original liner notes: “I was thinking of misty landscapes with wild flowers and strange, dimly-seen shapes – the kind of places where folklore and legend are born. And I was thinking of things like witchburnings too…”
From Seth and the folks at SoundstageDirect: “Just thirty-one at the time of this 1964 classic, Wayne Shorter had just recorded Night Dreamer and Ju-Ju for Blue Note within the past year and was at one of his major peaks of creativity. The music on Speak No Evil, which includes such future Shorter standards as Fe-Fi-Fo-Fum and Infant Eyes, is often beyond description for it combines the best elements of hard bop, post bop, free jazz and modal music plus Shorter’s own individual approach.”
In an earlier post I wrote about the emergence of a new UX for personalized learning called Learning Maps and how they will sit at the nexus of content, performance data and diagnostics for individual learners and study groups. One of the enabling components which will make that nexus possible is a more consistent approach to metadata and metadata’s contextual wrapper, paradata, (a topic I just blogged on and which @stevemidgley is a key sponsor of at DOE)
Recently @BrandtRedd and I have collaborated to underwrite a partnership between the Association of Education Publishers and Creative Commons on their publication of a lightweight extension to schema.org – the Open Search collaborative recently launched by Google, Bing and Yahoo, which is intended to accelerate markup of the web’s pages in ways recognized by the major search providers. In an exercise in mutual self-interest, it is our hope that early adoption of a schema extension can drive an improvement in the search experience for educational resources while giving OER and commercial publishers sufficient incentive to stay the course as a result of the improved UX the extension will help them to deliver to their customers (and yes, in some cases advertisers, as a result).
Our investment in a new and lightweight schema represents one of the 4 building blocks necessary to create a vibrant, competitive market of high-quality resources for personalized learning (the other three being learning maps, data and identity interop, and APIs for learning orchestration)
You can read more about the extension effort here and I will be blogging shortly on the improvements in UX we can expect as a result of the introduction of schema.org and its purposeful leveraging of HTML5 and CSS3
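For a feel of what schema.org style markup for a learning resource could look like, here is a rough sketch. Treat all of it as assumption: the post does not say which properties the extension defines or which syntax publishers will use. JSON-LD is picked here only because it is compact, the two education-specific property names are illustrative, and the resource name and URL are made up.

```python
import json


def describe_resource(name, url, resource_type, age_range):
    """Build a schema.org style description of one learning resource.

    The education-specific keys below are illustrative assumptions,
    not the published extension.
    """
    return {
        "@context": "https://schema.org",
        "@type": "CreativeWork",
        "name": name,
        "url": url,
        "learningResourceType": resource_type,  # assumed property name
        "typicalAgeRange": age_range,           # assumed property name
    }


# Hypothetical resource; example.org is a placeholder domain.
resource = describe_resource(
    name="Fractions on a number line",
    url="https://example.org/lessons/fractions",
    resource_type="lesson plan",
    age_range="10-11",
)
markup = json.dumps(resource, indent=2)
print(markup)
```

A publisher would embed output like this in the page for a resource, so that a search engine can filter on grade band or resource type rather than guessing from the body text.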
Here we are jamming away earlier this week on a workshop Gates co-hosted with rockstar computergrid maven Ian Foster and the Computation Institute at the University of Chicago. I’ve listed the attendees and their expertise at the end of this post so you have a sense for the mix in the room.
We gathered with 4 objectives in mind:
Here are some of the highlights of where we came out, and good news is that after 6 intensive hours we DO have our marching orders to get cracking on a demonstration project (more to come on that topic in future posts):
Paradata matters – strictly speaking a class of metadata that captures information about the process used to collect data or each observation in the data. Used thoughtfully, paradata can expand the range of data types you capture, thereby enriching the data you have to work with and the inferences you can derive from its analysis. This is one of the core principles behind Steve Midgley’s work at DOE. Put another way: paradata gives you richer context
Enable analysis of dataflow rather than data – data is too static a concept. We need to start thinking in terms of mining dataflow – kids these days traverse informal and formal learning spaces at increasing speed and frequency. Educational researchers are still stuck struggling to get access to 2 year old end-of-year test data! In order for personalized learning to be made actionable at point of service, we need to be able to better track the flow of data for a struggling individual (subject to security and privacy etc.) If we can do it for Medicine, why not Education? You think my sperm count is any less sensitive than how I am doing in 5th grade math!? Wait a minute, that didn’t come out right….
There’s a whole new market for services waiting to emerge – from recommendation and predictive services to content aggregation and capability measurement. Hard to predict what will actually succeed with teachers, kids and parents, but clear that there is a rich group of services that can save teachers time and actually help kids and parents get a handle on why and where they are struggling. Socos which is led by Vivienne Ming is one exciting example of an early start-up in this space
Trust is earned a recommendation at a time – service providers need to quickly establish some level of trust with us as potential users, in terms of their ability to support us and secure our repeat business. That trust needs to be formed as early in the transaction process as possible. Netflix, iTunes and Amazon all demonstrate the power of recommendations. However, to really convert you need to provide context, and that’s where most of the current consumer services still fall short. Why is this resource being recommended to me now? What is the recommendation based on? Are there alternatives I might want to consider? Were they factored in before this choice was prioritized? The nagging feeling I have here is that the consumer engines actually have the ability to do that now, but choose not to for fear of freaking us out completely in a Big Brother way. This is why we desperately need Diaspora or a similar concept to gain traction soon so we can all get our heads around what it means to own and manage a persona and avoid becoming a gadget
Current approaches to data privacy may be barse ackward – researchers at Microsoft Research are currently pursuing some hard-core work around the concept of Differential Privacy, which asserts that “achieving differential privacy revolves around hiding the presence or absence of a single individual”. What’s cool about this (and I in no way profess to understand all the math behind it completely) is that “sharper upper and lower bounds on noise required for achieving differential privacy against a sequence of linear queries can be obtained by understanding the geometry of the query sequence”. In other words, enough carefully calibrated noise can be added to the answer to any given query that it reveals essentially nothing about any one individual. Match this with point of service permissioning based on access rights and you have a much more robust and scalable approach to enabling researcher access to data that does not require months and years of paper application processing. For more on this, and the source of the above quotes, please check out Cynthia Dwork’s paper in the Communications of the Association for Computing Machinery
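For the curious, here is a minimal sketch of the simplest form of this idea: answering a counting query with Laplace noise. This is the textbook Laplace mechanism, not the geometric refinement quoted above, and the student records, the score threshold and the `epsilon` value are all made up for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution."""
    # The difference of two independent Exp(1) draws is Laplace(0, 1).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def private_count(records, predicate, epsilon, rng):
    """Answer 'how many records match?' with epsilon-differential privacy."""
    true_count = sum(1 for r in records if predicate(r))
    # Adding or removing one individual changes a count by at most 1,
    # so Laplace noise with scale 1/epsilon is enough for a single query.
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical records: (student_id, 5th grade math score).
students = [(i, 40 + (i * 7) % 60) for i in range(200)]
rng = random.Random(42)

struggling = private_count(students, lambda s: s[1] < 60, epsilon=0.5, rng=rng)
print(f"Noisy count of struggling students: {struggling:.1f}")
```

A smaller `epsilon` means more noise and stronger privacy. Note that this sketch handles one query only; each further query against the same data spends more of the privacy budget, which is where the work on sequences of queries comes in.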
The current IRB process needs mending – that’s Institutional Review Board to you. The groups that exist to protect the rights and welfare of research subjects. They have the power to reject or approve any and every aspect of a research request. The result – rather like the horrific Patent Process we are subject to in the US – is a humungous backlog of requests and a byzantine review and approval process. With the best of intentions we have managed to create a system that is choking the life out of the very research it is meant to enable. And heaven help you if your request cuts across more than one industry or IRB.
Looking forward to sharing more details as we progress on this area. For now here is a list of the folks my colleagues and I were lucky enough to work with that day:
Ian Foster (Argonne National Laboratory and University of Chicago, Mathematics and Computer Science)
(University of Chicago Urban Education Institute, Education data research and policy)
Stacy Ehrlich (University of Chicago)
Connie Yowell (MacArthur Foundation, Public Education and Digital Media)
An-Me Chung (MacArthur Foundation)
Ken Koedinger (Carnegie Mellon, Computer Science, Learning Analytics and Cognitive Psychology)
Steve Midgley (Office of Educational Technology, Department of Education, Data interoperability and Online learning)
Helen Taylor Martin (UT Austin, College of Education, Linguistics, Psychology and Classroom Technologies)
Vivienne Ming (Socos, Cognitive Modelling and Predictive Analytics)
Roy Pea (Stanford School of Education, Learning Sciences and Education)
Armistead Sapp (SAS Institute, Software development, Data and Analytics)
Daniel Schwartz (Stanford School of Education, Instructional Methods, Teachable Agents, Cognitive Neuroscience)
John Palmer (Applied Minds, Computer Science and Mathematics)
Tony Hey (Microsoft Research, Technical Computing)
Gary West (CCSSO, Education Information Systems and Research)
Mark Luetzelschwab (Agilix, Education Technology & Systems Interoperability)
Alex Szalay (Johns Hopkins University)