Using Linked Data to Support Careers Advice, Information and Guidance
For some time, I have been working on developing a Technology Enhanced Boundary Object (TEBO) to help Careers Advisers (Personal Advisers, or PAs) understand Labour Market Information (LMI). But I am increasingly interested in how we can access and visualise live LMI as part of the careers advice process. These are notes I have written about the idea.
What is linked data?
The Web enables us to link related documents (from linkeddata.org). Similarly, it enables us to link related data. The term Linked Data refers to a set of best practices for publishing and connecting structured data on the Web. Key technologies that support Linked Data are URIs (a generic means to identify entities or concepts in the world), HTTP (a simple yet universal mechanism for retrieving resources, or descriptions of resources), and RDF (a generic graph-based data model with which to structure and link data that describes things in the world). (Tom Heath, including excerpts from Bizer, Heath and Berners-Lee (in press))
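To make the RDF idea concrete, here is a minimal sketch of the graph-based triple model in plain Python. The URIs and property names are invented for illustration, not real identifiers:

```python
# Each RDF fact is a (subject, predicate, object) triple; subjects and
# predicates are URIs that identify things and relationships.
# All URIs below are illustrative placeholders.
triples = [
    ("http://example.org/occupation/plumber",
     "http://example.org/vocab/label", "Plumber"),
    ("http://example.org/occupation/plumber",
     "http://example.org/vocab/averagePay", 24000),
    ("http://example.org/occupation/plumber",
     "http://example.org/vocab/region",
     "http://example.org/region/wales"),
]

def objects_of(subject, predicate):
    """Return every object linked to a subject by a given predicate."""
    return [o for (s, p, o) in triples if s == subject and p == predicate]

print(objects_of("http://example.org/occupation/plumber",
                 "http://example.org/vocab/label"))
```

Because everything is expressed as triples over shared identifiers, data published by different agencies can be merged simply by pooling their triples.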
What is the relationship between Linked Data and the Semantic Web?
Opinions on this topic differ somewhat, however a widely held view is that the Semantic Web is made up of Linked Data; i.e. the Semantic Web is the whole, while Linked Data is the parts. Tim Berners-Lee, inventor of the Web and the person credited with coining the terms Semantic Web and Linked Data has frequently described Linked Data as “the Semantic Web done right.”
Using Linked data with Careers PAs in the UK
Through the MATURE project we have undertaken extensive research and consultation with PAs in different Connexions companies in England and Wales around the use of Labour Market Information in Careers Advice, Information and Guidance. Work undertaken through the project has aimed to support research into, and easy access to, documentation around different careers, including LMI. We are also aware that all LMI requires interpretation – a stage of knowledge maturing – and one aim has been to allow easy forms of interpretation through tagging etc. A second aim has been to allow the development of an organisational knowledge base through sharing the results of LMI research. LMI is based on various data, collected by different government agencies and by, for example, the Sector Skills Councils. In the past, access to this data has been restricted. Additionally, it requires considerable knowledge and skill to manipulate and interpret large data sets. Inevitably, much of the interpretation is over-generalised and frequently out of date.
Open Data
In autumn 2009, a new web site was launched in the UK based on an initiative by Tim Berners-Lee and Nigel Shadbolt. Data.gov.uk seeks to provide a way into the wealth of government data. As highlighted by the Power of Information Taskforce, this means it needs to be:
- easy to find;
- easy to license; and
- easy to re-use.
The aim is to publish government data as RDF – enabling data to be linked together. The web site says their approach is based on:
- Working with the web;
- Keeping things simple: we aim to make the smallest possible changes that will make the web work better;
- Working with the grain: we are not looking to rebuild the world. We appreciate that some things take time; others can be done relatively quickly. Everything has its own time and pace;
- Using open standards, open source and open data: these are the core elements of a modular, sustainable system; and
- Building communities, and working with and through them (both inside government and outside).
The new UK government has committed itself to backing this initiative, and increasingly local government organisations are providing open access to data. Many of the key data sets for LMI are available through the data.gov.uk site, including time series data on employment in different occupations, average earnings, Jobcentre vacancies (at fine-grained local office level and over a ten-year time series), qualifications, graduate destinations and so on, along with more generalised but critical data such as postcodes. All data can be queried in real time through a SPARQL interface.
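As a rough sketch of what querying such an interface over HTTP could look like: the endpoint URL, prefixes and property names below are hypothetical placeholders, not the real data.gov.uk vocabulary.

```python
# Build a SPARQL query for Jobcentre vacancies and encode it as an HTTP
# GET request. Endpoint and vocabulary are illustrative stand-ins.
from urllib.parse import urlencode

ENDPOINT = "https://example.data.gov.uk/sparql"  # placeholder endpoint

def vacancies_query(occupation_code: str, district: str) -> str:
    """Compose a SPARQL query for monthly vacancy counts in one district."""
    return f"""
    PREFIX ex: <http://example.org/lmi/>
    SELECT ?month ?count WHERE {{
        ?obs ex:occupation "{occupation_code}" ;
             ex:district "{district}" ;
             ex:month ?month ;
             ex:vacancies ?count .
    }} ORDER BY ?month
    """

def request_url(query: str) -> str:
    """Encode the query as a request URL for the endpoint."""
    return ENDPOINT + "?" + urlencode({"query": query, "format": "json"})

url = request_url(vacancies_query("5314", "Cardiff"))
print(url[:60])
```

In practice the occupation codes, district identifiers and result format would follow whatever vocabulary the published data sets actually use.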
Thus there is considerable potential to run queries and provide linked data providing valuable Labour Market and Careers information.
For instance:
A post code or location based query around a particular occupation could reveal:
- the average pay for that job
- Jobcentre vacancies in that job, at a local level, over past years
By querying external databases this could be extended to include:
- iCould videos about that career (there are something like 1,000 high-quality videos available)
- Job description along with required qualifications
Where XCRI course information data is available, the app could provide information on local courses related to that career (note: compliance with the XCRI data standard is patchy in the UK).
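Pulled together, such a query could assemble one client-facing record from several linked sources. In this sketch, every function and figure is a hypothetical stand-in for the real services (earnings data, Jobcentre vacancies, iCould):

```python
# Illustrative stand-ins for queries against separate linked data sources.
def lookup_pay(occupation):                  # stand-in for an earnings query
    return {"plumber": 24000}.get(occupation)

def lookup_vacancies(occupation, postcode):  # stand-in for a Jobcentre query
    return {"plumber": 12}.get(occupation, 0)

def lookup_videos(occupation):               # stand-in for an iCould search
    return [f"https://icould.example/videos/{occupation}"]

def occupation_profile(occupation, postcode):
    """Assemble one answer to a postcode + occupation query."""
    return {
        "occupation": occupation,
        "average_pay": lookup_pay(occupation),
        "local_vacancies": lookup_vacancies(occupation, postcode),
        "videos": lookup_videos(occupation),
    }

profile = occupation_profile("plumber", "CF10")
print(profile["average_pay"])
```

The value of the linked data approach is that each of these lookups could be a live query rather than a periodically compiled report.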
Maturing Knowledge – the role of the PA
Whilst this system would be a great advance on anything presently available, it is not perfect. LMI data still requires interpretation. For instance, Jobcentre data has a known bias towards public sector employment, lower paid jobs and short term employment. The search only covers past data and may not reveal longer term labour market trends. Thus, ideally, following such a search the PA would be able to add brief notes before saving the search. These overall results could then be packaged and sent to a client as well as stored within the organisational system. Using the new information and knowledge sources being made available through the Careers Project requires new interpretation skills on the part of PAs. Thus the development of a linked data app would also be accompanied by the development of the TEBO, which aims to provide informal learning for PAs around using LMI.
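One way to sketch this workflow, with purely illustrative field names, is a small record type that keeps the raw search results and the PA's interpretive notes together, ready for sharing with a client or storing in the organisational knowledge base:

```python
# A saved search that carries the PA's interpretation alongside the data.
from dataclasses import dataclass, field
from datetime import date

@dataclass
class SavedSearch:
    occupation: str
    postcode: str
    results: dict
    pa_notes: list = field(default_factory=list)
    saved_on: date = field(default_factory=date.today)

    def annotate(self, note: str):
        """Attach a PA's interpretation before sharing or storing."""
        self.pa_notes.append(note)

search = SavedSearch("plumber", "CF10", {"vacancies": 12})
search.annotate("Jobcentre figures under-represent private-sector vacancies.")
print(len(search.pa_notes))
```

Keeping the caveats attached to the data is one way the raw search results would mature into shared organisational knowledge.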
Visualisations
Although an early version of the system might well be text based, it would enhance data interpretation to provide visualisations of the data. It may be possible to do this dynamically, for instance using APIs to the IBM open source Many Eyes application.
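Even without an external service, a simple rendering of vacancy counts over time can aid interpretation. The figures below are invented for the example:

```python
# Render a time series of vacancy counts as scaled text bars.
vacancies = {"2007": 140, "2008": 120, "2009": 65, "2010": 80}

def bar_chart(series, width=40):
    """Return one line per period, with a bar scaled to the peak value."""
    peak = max(series.values())
    lines = []
    for label, value in series.items():
        bar = "#" * round(value / peak * width)
        lines.append(f"{label} | {bar} {value}")
    return "\n".join(lines)

print(bar_chart(vacancies))
```

A richer visualisation service would replace the text bars, but the shape of the task is the same: take the query result and turn it into something a client can read at a glance.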
I was involved in developing a similar project in Iceland about 8-9 years ago. The slight difference was that our project was more oriented toward educational advising. What we wanted to do was to link labour market info to our learning opportunities database. As with your project, it seemed ideal to use a metadata framework (like RDF) to do this. Our major hurdle with this approach was that producers of LMI didn't want to add the needed metadata to their databases. They just didn't see any benefit for themselves in doing so. There were also other difficulties that ultimately rendered the project unfeasible, some of which you mention. For example, we experienced the same as you regarding the publicly available data on lower paid jobs vs. private data on higher paid and career path jobs. For the latter we would have to tap into professional career service firms' data, and they wanted too much in license fees for that.
Most of the problems we encountered could have been overcome in some way, e.g. by negotiating licensing fees. The major hurdle, however, was the metadata issue.
Here’s what I would say that I learned:
Metadata presents a lot of interesting opportunities, and these quickly become obvious when we are considering the info tech aspects. But metadata needs to be generated and (so far, at least) it needs human interaction to do so if it's to be accurate. Furthermore, metadata generation for specific uses like your project (and our former project) can't be a free-for-all like contemporary tagging fads are. It needs to be based on precise thesauri to ensure precision in matching data from more than one resource. It seems that this isn't something that a lot of people are willing to do (even where standards exist). I got the sense that people feel like this is something that computers should do and believe that computers can do (I think that this belief is largely due to advancements in free-text search efficiency). But metadata and the Semantic Web have to do with meaning, and this is still something that computers just can't handle. So, while everyone thought it was a fascinating project and was quick to see the potential benefits, few were willing to invest in the work needed to see it through.
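As a minimal illustration of the point about thesauri (with an entirely invented vocabulary): free-text tags for the same occupation have to be mapped to one canonical term before data from different resources can be joined reliably.

```python
# Map free-text occupation tags to a single preferred thesaurus term.
# The vocabulary here is invented for the example.
THESAURUS = {
    "plumber": "plumber",
    "plumbing engineer": "plumber",
    "heating and plumbing": "plumber",
}

def canonical(tag: str):
    """Return the preferred term for a free-text tag, or None if unknown."""
    return THESAURUS.get(tag.strip().lower())

# Two databases tagging the same job differently still join correctly:
print(canonical("Plumbing Engineer") == canonical("plumber"))
```

Without the controlled vocabulary, the two records would simply fail to match, which is exactly the precision problem free-for-all tagging runs into.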
I also had some (albeit limited) involvement in the initial stages of the European Commission's Ploteus project. There, I also emphasised the need for standards and metadata and got the same reaction as with our database project, i.e. the member states all had their databases already and were not very interested in changing their data formats for the sake of the project.
It’ll be interesting to see how your project goes and especially to see if, and how, attitudes have changed regarding the use of metadata and standards. It seems to me that this remains a significant hurdle to realising the many fascinating possibilities that a more semantic web offers.