Rare Book Monthly

Articles - September - 2023 Issue

Where Do AI Programs Get Their Data? It Turns Out Some Comes from Copyrighted Books, Without Permission

CatGPT?

Where does the information you get from artificial intelligence (AI) sources like ChatGPT come from? It comes from a lot of places, including the reams of data on the internet, but a significant source is books. Many, if not most, are of recent vintage, as up-to-date information is needed for the best answers. As such, most of these books are under copyright. However, the authors and publishers of these books have neither been asked for permission nor compensated. Is this legal, an acceptable use of copyrighted works, or a violation of copyright law? Good question. No one knows the answer since it has not been adjudicated in court.

AI programs draw much of their data from training databases, which also teach them how language is used so they can give understandable answers. These are databases filled with an enormous amount of information. How about the best known AI program, ChatGPT? Did it learn from a training database? To answer this, we went to the ultimate authority, ChatGPT itself. It responded, “Yes, ChatGPT, like other GPT-3 models, is trained on a large and diverse dataset containing a wide range of text from the internet. This dataset includes books, articles, websites, and other sources of human-generated text. The model learns patterns, language structures, and information from this training data, which it then uses to generate responses to user inputs.”

One such online training database is called “The Pile,” and a subset of The Pile is Books3. The Pile contains data from numerous sources, with Books3 providing the book element. Books3 contains 196,000 books, converted to searchable text. It is not necessarily in a format that would allow you to read it as a book, but the text is there. Most of the books are likely copyrighted but used without permission. Books3 was freely available on the internet to anyone seeking to build an AI model. Its creator made it so, as he wanted even small developers to have a shot at creating a model.

Books3 was recently removed from the internet. It was taken down at the request of Rights Alliance, a group representing Danish publishers, which determined that 150 of the titles used were published by its members. The Eye, the website hosting Books3, complied.

This issue is already starting to appear in court, and we can expect to see more cases until some sort of decision is reached on where AI training databases and copyright law intersect. It is argued that this use is “Fair Use,” a doctrine that allows you to quote brief parts of a book without running afoul of copyright law. Training an AI model can be argued to be similar, and it does not even involve direct quoting. It is somewhat like conducting research in a library. However, it is also true that these databases have copied entire books to do their searching. It is also notable that the authors are not being compensated, while they are at risk of losing sales to people who would rather do their research through services like ChatGPT. Of course, the database compiler can license the material from the publisher, but that would require many deals with many people, and it might be prohibitively expensive for all but the largest corporations. That is what the Books3 founder sought to avoid. Maybe ChatGPT can come up with an answer to this dilemma.

Note on illustration. What the...? I asked ChatGPT's image generator for a picture of ChatGPT. This is what it gave me. Why? Perhaps it has to do with the French word for “cat” being “chat,” but who knows what its artificial mind was thinking. Hopefully, its textual answers are a little better.


Posted On: 2023-09-01 12:11
User Name: PeterReynolds

Textual answers better? Not in my experience. I asked it for the chapter titles of a book which it knew how to find online, formatted as a numbered list. It would only give me a list of chapters that it felt ought to be in books of this type, not the ones in the particular book, despite being able to point me to where I could find and read the book online.

