A blog about Open Source, my work at the Gates Foundation and those I am fortunate enough to collaborate with
In an earlier post I wrote about the emergence of a new UX for personalized learning called Learning Maps, and how they will sit at the nexus of content, performance data and diagnostics for individual learners and study groups. One of the enabling components which will make that nexus possible is a more consistent approach to metadata and to metadata’s contextual wrapper, paradata (a topic I just blogged about, and one @stevemidgley is a key sponsor of at the DOE).
Recently @BrandtRedd and I have collaborated to underwrite a partnership between the Association of Educational Publishers and Creative Commons on their publication of a lightweight extension to schema.org – the search collaborative recently launched by Google, Bing and Yahoo, which is intended to accelerate markup of the web’s pages in ways recognized by the major search providers. In an exercise in mutual self-interest, it is our hope that early adoption of a schema extension can drive an improvement in the search experience for educational resources, while giving OER and commercial publishers sufficient incentive to stay the course as a result of the improved UX the extension will help them deliver to their customers (and yes, in some cases advertisers as a result).
Our investment in a new and lightweight schema represents one of the 4 building blocks necessary to create a vibrant, competitive market of high-quality resources for personalized learning (the other three being learning maps, data and identity interop, and APIs for learning orchestration).
You can read more about the extension effort here, and I will be blogging shortly on the improvements in UX we can expect as a result of the introduction of schema.org and its purposeful leveraging of HTML5 and CSS3.
Here we are jamming away earlier this week at a workshop Gates co-hosted with rockstar grid-computing maven Ian Foster and the Computation Institute at the University of Chicago. I’ve listed the attendees and their expertise at the end of this post so you have a sense of the mix in the room.
We gathered with 4 objectives in mind:
Here are some of the highlights of where we came out. The good news is that after 6 intensive hours we DO have our marching orders to get cracking on a demonstration project (more to come on that topic in future posts):
Paradata matters – strictly speaking, paradata is a class of metadata that captures information about the process used to collect data, or about each observation in the data. Used thoughtfully, paradata expands the range of data you capture, thereby enriching the data types you have to work with and the inferences you can derive from their analysis. This is one of the core principles behind Steve Midgley’s work at the DOE. Put another way: paradata gives you richer context.
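To make the distinction concrete, here is a minimal sketch; the resource, field names and usage events are my own invented examples, not drawn from any published paradata spec:

```python
# Traditional metadata: describes the resource itself.
resource = {
    "title": "Fractions on a Number Line",
    "subject": "Grade 5 Math",
}

# Paradata: describes how the resource was actually used and by whom,
# i.e. the process that produced the usage data.
paradata_events = [
    {"actor": "teacher", "verb": "recommended", "context": "remediation", "count": 12},
    {"actor": "student", "verb": "completed", "context": "homework", "count": 48},
]

def summarize(events):
    """Roll paradata up into per-verb totals, giving context the raw metadata lacks."""
    totals = {}
    for e in events:
        totals[e["verb"]] = totals.get(e["verb"], 0) + e["count"]
    return totals

print(summarize(paradata_events))  # {'recommended': 12, 'completed': 48}
```

The metadata alone tells you what the resource is; the paradata roll-up tells you who reached for it, and why, which is exactly the richer context described above.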
Enable analysis of dataflow rather than data – data is too static a concept. We need to start thinking in terms of mining dataflow – kids these days traverse informal and formal learning spaces at increasing speed and frequency, while educational researchers are still stuck struggling to get access to two-year-old end-of-year test data! In order for personalized learning to be made actionable at the point of service, we need to be able to better track the flow of data for a struggling individual (subject to security and privacy constraints, etc.). If we can do it for Medicine, why not Education? You think my sperm count is any less sensitive than how I am doing in 5th grade math!? Wait a minute, that didn’t come out right….
There’s a whole new market for services waiting to emerge – from recommendation and predictive services to content aggregation and capability measurement. It’s hard to predict what will actually succeed with teachers, kids and parents, but clear that there is a rich group of services that can save teachers time and actually help kids and parents get a handle on why and where they are struggling. Socos, which is led by Vivienne Ming, is one exciting example of an early start-up in this space.
Trust is earned a recommendation at a time – service providers need to quickly establish some level of trust with us, their potential users, in terms of their ability to support us, if they are to secure our repeat business. That trust needs to be formed as early in the transaction process as possible. Netflix, iTunes and Amazon all demonstrate the power of recommendations. However, to really convert you need to provide context, and that’s where most of the current consumer services still fall short. Why is this resource being recommended to me now? What is the recommendation based on? Are there alternatives I might want to consider? Were they factored in before this choice was prioritized? The nagging feeling I have here is that the consumer engines actually have the ability to do that now, but choose not to for fear of freaking us out completely in a Big Brother way. This is why we desperately need Diaspora or a similar concept to gain traction soon, so we can all get our heads around what it means to own and manage a persona and avoid becoming a gadget.
Current approaches to data privacy may be bass-ackwards – researchers at Microsoft Research are currently pursuing some hard-core work around the concept of Differential Privacy, which asserts that “achieving differential privacy revolves around hiding the presence or absence of a single individual.” What’s cool about this (and I in no way profess to understand all the math behind it completely) is that “sharper upper and lower bounds on noise required for achieving differential privacy against a sequence of linear queries can be obtained by understanding the geometry of the query sequence.” In other words, sufficient noise can be introduced into any given query to render it essentially private. Match this with point-of-service permissioning based on access rights and you have a much more robust and scalable approach to enabling researcher access to data, one that does not require months and years of paper application processing. For more on this, and the source of the above quotes, please check out Cynthia Dwork’s paper in the Communications of the ACM.
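As a toy illustration of the noise idea (my own simplified sketch, nowhere near the geometric-bounds machinery in Dwork’s paper), a counting query can be protected by adding Laplace noise scaled to the query’s sensitivity:

```python
import random

def private_count(true_count, epsilon):
    """Answer a counting query with Laplace(1/epsilon) noise.

    Adding or removing any single individual changes a count by at most 1,
    so noise with scale 1/epsilon suffices for epsilon-differential privacy.
    The difference of two Exponential(epsilon) draws is Laplace(0, 1/epsilon).
    """
    return true_count + random.expovariate(epsilon) - random.expovariate(epsilon)

# A researcher asks "how many 5th graders scored below basic?" and gets an
# answer accurate enough for research, yet noisy enough to hide any one student.
noisy_answer = private_count(142, epsilon=0.5)
```

Smaller epsilon means more noise and stronger privacy; the researcher never sees the true count, so no single student’s presence can be inferred from the answer.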
The current IRB process needs mending – that’s Institutional Review Board to you: the groups that exist to protect the rights and welfare of research subjects. They have the power to reject or approve any and every aspect of a research request. The result – rather like the horrific patent process we are subject to in the US – is a humongous backlog of requests and a byzantine review and approval process. With the best of intentions we have managed to create a system that is choking the life out of the very research it is meant to enable. And heaven help you if your request cuts across more than one industry or IRB.
Looking forward to sharing more details as we progress in this area. For now, here is a list of the folks my colleagues and I were lucky enough to work with that day:
Ian Foster (Argonne National Laboratory and University of Chicago, Mathematics and Computer Science)
(University of Chicago Urban Education Institute, Education data research and policy)
Stacy Ehrlich (University of Chicago)
Connie Yowell (MacArthur Foundation, Public Education and Digital Media)
An-Me Chung (MacArthur Foundation)
Ken Koedinger (Carnegie Mellon, Computer Science, Learning Analytics and Cognitive Psychology)
Steve Midgley (Office of Education and Technology, Department of Education, Data interoperability and Online learning)
Helen Taylor Martin (UT Austin, College of Education, Linguistics, Psychology and Classroom Technologies)
Vivienne Ming (Socos, Cognitive Modelling and Predictive Analytics)
Roy Pea (Stanford School of Education, Learning Sciences and Education)
Armistead Sapp (SAS Institute, Software development, Data and Analytics)
Daniel Schwartz (Stanford School of Education, Instructional Methods, Teachable Agents, Cognitive Neuroscience)
John Palmer (Applied Minds, Computer Science and Mathematics)
Tony Hey (Microsoft Research, Technical Computing)
Gary West (CCSSO, Education Information Systems and Research)
Mark Luetzelschwab (Agilix, Education Technology & Systems Interoperability)
Alex Szalay (Johns Hopkins University)
There are 7 core use cases that we believe such a map can help us address:
All one requires to get started is a set of learning objectives in a machine-readable format. The author of those objectives would also have the option, at publication, of describing the relationships between them, thereby enabling a rendering of the progression from topic to topic. This is but an initial assertion: evidential probability analysis would let true learning pathways and relationships between objectives emerge over time, as more and more people lay down paths through the subject area. If one wanted to get truly funky, one could leverage an arcane markup like PR-OWL to weight the relationships between objectives, allowing for further differentiation and customization to an individual’s learning patterns.
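Here is a minimal sketch of what “machine-readable objectives plus relationships” could look like; the objective IDs and the “requires” field are my own assumptions for illustration, not any published standard:

```python
# Hypothetical objective set: IDs, descriptions and prerequisite links are
# invented; a real map would use published standards identifiers.
objectives = {
    "count-to-100": {"description": "Count to 100 by ones and tens", "requires": []},
    "place-value": {"description": "Understand two-digit place value", "requires": ["count-to-100"]},
    "add-within-100": {"description": "Add within 100", "requires": ["place-value"]},
}

def progression(objs):
    """Order objectives so every prerequisite precedes the objectives that need it."""
    ordered, seen = [], set()

    def visit(oid):
        if oid in seen:
            return
        seen.add(oid)
        for prereq in objs[oid]["requires"]:
            visit(prereq)  # prerequisites land in the list first
        ordered.append(oid)

    for oid in objs:
        visit(oid)
    return ordered

print(progression(objectives))  # ['count-to-100', 'place-value', 'add-within-100']
```

The author’s initial “requires” assertions give you a first rendering of the topic-to-topic progression; the evidential analysis described above would then re-weight those edges as real learners lay down paths.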
So what might the rest of the recipe for a learning map look like? Here’s my guesstimate:
Coupled with these basic ingredients, we would also require some transactional web service capabilities to support the feedback loops and uses I listed earlier. In roughly increasing order of complexity, those would include:
So what might one of these maps actually look like? Figure 2 below shows an example. It was created by Larry Berger and Laurence Holt of Wireless Generation and provides an exciting sample visualization of a learning map for the Common Core Math Standards that could be built using the basic ingredients described above. Larry and Laurence write: “Known as “the honeycomb,” this application would be interactive and display a student’s progress through the standards. Each hexagon represents a single skill or concept, and groups of hexagons reflect the clusters of skills and concepts that together make up a standard. Drawing on the data infrastructure of the SLI, such a map could track a student’s progress, with cells turning from red to yellow to green as he mastered components of the standards. The slider on the left side of the screen would allow the student or his teacher to zoom in on the cells, which would display more granular information and links to aligned content and diagnostic assessments to help the student continue to move ahead. Individual student maps could roll up to classroom maps, classrooms could be aggregated to school maps and so on, up to the district and state levels.”
Figure 2: Visualization of the Common Core Learning Map
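The red-to-yellow-to-green behavior Larry and Laurence describe is easy to sketch; the thresholds and skill names below are my own assumptions, not Wireless Generation’s:

```python
# Hypothetical sketch of the honeycomb's cell-coloring logic; the mastery
# thresholds (0.8 and 0.4) are illustrative choices, not published values.
def cell_color(mastery):
    """Map a 0.0-1.0 mastery estimate onto the honeycomb's three colors."""
    if mastery >= 0.8:
        return "green"
    if mastery >= 0.4:
        return "yellow"
    return "red"

# One student's estimated mastery per skill (invented data).
student_map = {"place-value": 0.9, "add-within-100": 0.55, "fractions": 0.1}
colored = {skill: cell_color(m) for skill, m in student_map.items()}
print(colored)  # {'place-value': 'green', 'add-within-100': 'yellow', 'fractions': 'red'}
```

The roll-up to classroom, school and district maps then becomes an aggregation over many such per-student dictionaries, say by averaging mastery per skill before coloring.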
I am interpreting Larry and Laurence to be describing a visualization of raw XML in a native app or downloadable client. It would not be hard to add to their list of features the URI links and a way to express sequencing between the hexagons. One could imagine a service event being triggered either by onMouseover or when a student actually “shows up” in that hexagon, i.e. data informs the map of the student’s new location. The one tricky part would be brokering the link between a URI describing a hexagon in the visualization from the app, over a firewall, through a service broker, through a proxy or two, over another firewall and into a publisher’s digital content server, where a relevant resource is then retrieved. Steve Midgley and his team are one of the groups working to tackle that problem, which is great, because it’s really tricky and Steve is really smart.
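Stripped of the firewalls and proxies, the brokering step reduces to a lookup that maps a hexagon’s URI onto a registered publisher resource. Every URI and the registry table below are invented for illustration only:

```python
# Hypothetical broker table: in reality this mapping would live in a shared
# registry service, not an in-process dictionary.
registry = {
    "map://honeycomb/5.NBT.1": "https://publisher.example.com/resources/place-value-intro",
}

def resolve(hex_uri, registry):
    """Broker a hexagon URI to the publisher resource registered for it."""
    target = registry.get(hex_uri)
    if target is None:
        raise LookupError("no resource registered for " + hex_uri)
    return target

def on_student_arrival(hex_uri):
    """Fired when data informs the map that a student has entered a hexagon."""
    return resolve(hex_uri, registry)

print(on_student_arrival("map://honeycomb/5.NBT.1"))
```

The hard engineering problem the paragraph describes is making that lookup work reliably across organizational boundaries; the lookup itself is the easy part.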
The visualization is exciting for the possibilities it represents and its intuitive UI. However, it is also limited by its medium. There is no reason why the same functionality described in the visualization could not also be accomplished using a combination of the APIs contained in the new HTML5, along with the design advances in CSS3 and the performance gains we’re seeing from JavaScript frameworks like jQuery. Such an approach would also have advantages and afford additional options over a more traditional app approach, namely:
Based on my early inspection of HTML5 and its API set, I believe we can build an open and extensible Learning Map web app, and that it would be the sort of project that would lend itself well to the open source community to sustain. However, we would still need solutions to the earlier list of system capabilities required to support such an app, namely: integration with a student datastore so learners can be mapped; a way to link URIs across servers; a way to broker services between URIs; and finally, a way to build engines underneath the map capable of supporting adaptive tutoring and diagnostics.
Of these issues, the first could be dealt with by some form of federated datastore. The last could be dealt with using a combination of datamining, Bayesian analytics and scalable machine-learning libraries like Apache Mahout, or with integrated approaches from commercial providers like SAS or IBM, which can now couple Watson’s capabilities to its recently acquired SPSS offering.
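As a hint of what “Bayesian analytics” underneath the map could mean in practice, here is a minimal knowledge-tracing-style update; the slip and guess parameters are illustrative assumptions, not tuned values from any real system:

```python
# Hypothetical sketch of a Bayesian mastery update for one skill (hexagon).
def update_mastery(p_mastered, correct, slip=0.1, guess=0.2):
    """Update P(skill mastered) after one observed answer, via Bayes' rule."""
    if correct:
        likelihood_m = 1 - slip   # mastered students rarely slip
        likelihood_u = guess      # unmastered students sometimes guess right
    else:
        likelihood_m = slip
        likelihood_u = 1 - guess
    evidence = p_mastered * likelihood_m + (1 - p_mastered) * likelihood_u
    return p_mastered * likelihood_m / evidence

# Start agnostic, then observe a short answer sequence (invented data).
p = 0.5
for answer in [True, True, False, True]:
    p = update_mastery(p, answer)
```

An engine like this is what would drive a hexagon from red toward green as evidence accumulates; at scale you would fit the slip and guess parameters from data rather than assert them.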
So that leaves URI linking and transaction services. To solve that problem, one could take advantage of Steve’s Learning Registry and its elegant NNTP-like approach. Here’s an illustration of the Registry’s Transport Network:
There are some questions we need to answer before going with an HTML/script-based approach to produce an actual navigable map:
I hope this post has been easy to follow, and I would greatly appreciate hearing your reaction to it.