Hello Friends,
With reference to OpenThesis, there
are many thorny issues. NDLTD has been irked by the uncivil/unethical practice
of harvesting not only the metadata but the full text too (which is against the
basic tenet of OAI-PMH). Many discussions are currently going on,
and NDLTD members have taken up many issues. So far, we (NDLTD) have
not received convincing and satisfactory answers.
I reproduce one open letter sent by John Hagen,
a member of the NDLTD Board, to Pete Celano
of OpenThesis, which I believe raises the pertinent issues, for the benefit of LIS Forum members.
-----
I need to clear the air; please allow me to rant.
After viewing all the ruckus the past few days on the ETD
listserv about OpenThesis, I am bothered by the fact that you (Patents Online
LLC, not necessarily you personally), did not follow the Open Access Initiative
(OAI) harvesting protocol as I had previously mentioned (you will
recall the Google/NDLTD story I told you when they wanted to do the same some
years ago). Harvesting the metadata is fine; however, I was not aware you
intended to harvest the documents themselves. This was not part of the
bargain.
The OAI harvesting protocol is intended to harvest only the
metadata (bibliographic information) along with a pointer (URL) back to the
originating institution to provide document access. The documents
themselves are NOT harvested. There are a number of reasons why, as
outlined below.
- Integrity: the originating institution
maintains copy (version) control. From time to time document versions may
be replaced for a variety of reasons. The only way to maintain authority
over versions is to always point back to the version at the originating
institution.
- Authentication: End-users know they are accessing
the genuine document when it is served via the originating institutions' IP
network / institutional repository.
- Acknowledgment: Branding and Access Statistics,
Copyright...
-- Branding: Institutions often have their unique
branding which accompanies the metadata and document file link appearing on the
Web screen of their institutional repository.
-- Access Statistics: Institutions maintain access log
statistics to report the popularity (via hit count) of documents
downloaded. This information is often published and enables graduate
alumni to document the success of their research by number of accesses to the
thesis or dissertation.
-- Copyright: This can vary from country to country,
but individuals as well as institutions can get very possessive about
intellectual property they produce. Granted, OAI technically allows
robotic harvesting unless the admin embeds a "stop crawling
identifier", but the basic tenets of civility worked out in the OAI protocol
mean we don't just take things (i.e. digital objects) from others, especially
without asking first, very much for the reasons specified above.
I will give you concrete examples of how this should be
done. I had previously cited VTLS Visualizer at http://www.vtls.com/ndltd.
See also Scirus ETD Search at http://www.ndltd.org/serviceproviders/scirus-etd-search.
These are listed as ETD search interfaces on the NDLTD page at http://www.ndltd.org/find.
You will notice that when you do a search on either system,
they index the basic information (correctly I might add), and point back to the
originating institution to provide document access. In many cases it
points back to a "document data" screen in the originating
institutions' IR, and provides a link to the document. In some cases, it
will take you directly to the document via a "Handle", also published
as part of the metadata. These services include not only metadata about
open access ETDs, but also include the metadata for restricted access ETDs as
well (even though the document may require login to access the file in the case
of a campus restriction, the metadata and the fact that the document exists is
indexed to reflect the entire holdings of the collection).
The solution I would recommend is that in OpenThesis, you
provide the ETD metadata, ensure it is accurate, and most
importantly, point back to the originating institution for document access; do
not provide file access from the copy you harvested on your server.
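As a rough illustration of the recommendation above, here is a minimal
Python sketch of a metadata-only index entry built from an OAI-PMH
ListRecords response in the standard oai_dc format. The sample record, the
repository.example.edu URLs and the index_entries helper are all made up
for illustration; a real harvester would fetch the XML from a repository's
OAI-PMH base URL instead of using an embedded string. The point is only
that the index stores the pointer back to the originating institution and
never downloads the document itself.

```python
import xml.etree.ElementTree as ET

# Standard namespaces for OAI-PMH 2.0 responses and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical sample response; the repository and record are invented.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.edu:etd-1234</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An Example Thesis</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
          <dc:identifier>http://repository.example.edu/etd/1234</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def index_entries(oai_xml):
    """Build metadata-only index entries that point back to the source."""
    root = ET.fromstring(oai_xml)
    entries = []
    for record in root.iterfind(".//oai:record", NS):
        dc = record.find("oai:metadata/oai_dc:dc", NS)
        if dc is None:
            # Records flagged as deleted carry a header but no metadata.
            continue
        identifiers = [e.text for e in dc.findall("dc:identifier", NS) if e.text]
        # Keep the first URL-style identifier as the link to the institution.
        link = next(
            (i for i in identifiers if i.startswith(("http://", "https://"))),
            None,
        )
        entries.append({
            "oai_id": record.findtext("oai:header/oai:identifier", namespaces=NS),
            "title": dc.findtext("dc:title", namespaces=NS),
            "creator": dc.findtext("dc:creator", namespaces=NS),
            "source_url": link,  # the document is linked, never copied
        })
    return entries


if __name__ == "__main__":
    for entry in index_entries(SAMPLE_RESPONSE):
        print(entry["title"], "->", entry["source_url"])
```

A search interface built on such entries would show the title and author
and send the reader to source_url for the document, so version control,
branding and access statistics stay with the originating institution.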
Going out and grabbing IP just because you can, then telling folks
that authors and institutions can voluntarily upload their
IP, only for them to find out after the fact that you have already
harvested it without their permission: it seems to me
there must be a better approach than this. At least this is my perception
of what the NDLTD constituency have been so irked about since you made the
OpenThesis announcement. I am happy to work with you.
This does not have to be adversarial, nor should it
be. But please, you need to be a team player. Let me know your
thoughts.
Thanks,
-
ps - Of course there is nothing prohibiting authors or
institutions from voluntarily uploading copies of ETDs in the OpenThesis
system, for reasons of recognition, royalty, online viewing convenience and/or
print-on-demand service, citation impact, preservation, etc.
(barring local copyright legislation as some have cited regarding
institutions).
<><><><><><><><><><><><><><><><><>
Manager, WVU Institutional Repository Program / Coordinator,
Electronic Thesis & Dissertation Program
Board Member, NDLTD
Co-Chair, ETD 2009 Symposium
Acquisitions Department
Wise Library, Room 2510
http://www.wvu.edu/~thesis/
Executive Director and Professor
Phone : + 91
Fax: + 91
ISiM -
From:
Sent: 04 February 2010 11:01
To:
Subject: [LIS-Forum] Open Thesis
Dear Members,
You may now search, upload or download the full text of many
theses/dissertations/other academic documents submitted to various universities
in all parts of the world in different subject areas using the free Open Thesis
database. You may find more information and search this database at:
http://www.openthesis.org
Interested institutions may participate in this project and get a
collection-specific homepage.
With best Regards,
Deputy Librarian
Indira Gandhi Institute of Development Research
Gen
MUMBAI-400 065,
Phone: 91-22-2841 6547 ; 2840 0919 Ext: 547
E-mail: pujar@igidr.ac.in
Skype: sham.pujar
URL: http://www.igidr.ac.in
Blog: http://spujar.wordpress.com
--
This message has been scanned for viruses and
dangerous content by
MailScanner, and is
believed to be clean.