Posts Tagged 'corpus'

“Keywords” from two 2017 speeches by King Felipe #CorpusLinguistics #Oratoria #Corpus

Using the tools in Sketch Engine, I have extracted the keywords of two speeches by Felipe VI. In corpus linguistics, a keyword is a statistically salient term: “keywords are those whose frequency is unusually high in comparison with some norm” (source)
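The quoted definition can be made concrete with a small sketch. One simple way to measure "unusually high frequency in comparison with some norm" is a smoothed ratio of per-million frequencies between a focus text and a reference corpus. This is in the spirit of the "simple maths" keyness score associated with Sketch Engine, as far as I understand it. The function names, the smoothing value, and the toy token lists are assumptions for illustration, not the settings or data behind the speech analysis.

```python
from collections import Counter


def keyness_scores(focus_tokens, reference_tokens, smoothing=1.0):
    """Score each word of the focus text by how unusually frequent it is.

    Score = (fpm_focus + smoothing) / (fpm_reference + smoothing),
    where fpm is the frequency per million tokens.
    """
    if not focus_tokens or not reference_tokens:
        raise ValueError("both token lists must be non-empty")

    focus_counts = Counter(focus_tokens)
    reference_counts = Counter(reference_tokens)
    focus_size = len(focus_tokens)
    reference_size = len(reference_tokens)

    scores = {}
    for word, count in focus_counts.items():
        fpm_focus = count * 1_000_000 / focus_size
        # Counter returns 0 for words absent from the reference corpus.
        fpm_reference = reference_counts[word] * 1_000_000 / reference_size
        scores[word] = (fpm_focus + smoothing) / (fpm_reference + smoothing)
    return scores


def top_keywords(focus_tokens, reference_tokens, n=5, smoothing=1.0):
    """Return the n highest-scoring words (ties broken alphabetically)."""
    scores = keyness_scores(focus_tokens, reference_tokens, smoothing)
    return sorted(scores, key=lambda w: (-scores[w], w))[:n]


if __name__ == "__main__":
    # Invented toy data, only to show the call shape.
    focus = ["a", "b", "b", "c"]
    reference = ["a", "a", "a", "c"]
    print(top_keywords(focus, reference, n=2))
```

A larger smoothing value favours more common words; a smaller one favours rare words, so the ranking depends on that choice as well as on the reference corpus.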

The keywords of the speech of 3 October 2017 are as follows:



[Screenshot: keyword list for the 3 October 2017 speech (Screen Shot 2017-12-25 at 21.12.10.png)]




The keywords of the speech of 24 December, for their part, are:

[Screenshot: keyword list for the 24 December 2017 speech (Screen Shot 2017-12-25 at 21.17.02.png)]


Pending a deeper analysis of these terms, the priorities of the Royal Household stand out at once.


New interface to search Google Books

This new interface to search Spanish Google Books has been created by Prof. Mark Davies (Brigham Young University). It allows you to search more than 45 billion words (45,000,000,000) in the Spanish datasets from the year 1500 to 2000. (See also the interface to search English Google Books (155 billion words), and the Corpus of Global Web-Based English, which can provide data on differences between dialects of English.)

This interface allows you to search the Spanish and English Google Books in many ways that are much more advanced than what is possible with the simple Google Books interface:
– You can search by word, phrase, substring, lemma, part of speech, synonyms, and collocates (nearby words).
– You can copy the data to other applications for further analysis, which you can’t do with the regular Google Books interface.
– And you can quickly and easily compare the data in two different sections of the corpus (for example, adjectives describing women or art or music in the 1960s-2000s vs. the 1870s-1910s).

Note, however, that what you see here is still an early version of the corpus (interface). New features will be added and corrections will be made over the coming months. Please feel free to take a five-minute guided tour (based on the American English dataset), which will show you the major features of the corpus.

A simple click on each query will automatically fill in the form for you, display the results from the 45 billion words of Spanish text, and then provide links to the actual books at Google Books:

[mujer] [J*] (all forms of mujer ‘woman’ followed by an adjective)

[hacer] * * [NN*] (any form of the verb hacer ‘to make, to do’, followed by any two words, followed by a noun)

[ir] [VPP*] (any form of the verb ir ‘to go’, followed by a gerund)

[acabar] por [V*] (any form of the verb acabar ‘to end up’, followed by por ‘by’, followed by a verb)
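To illustrate how queries of this shape can be read, here is a toy matcher over a hand-tagged sentence. It is a sketch under simplifying assumptions, not the interface's actual implementation: a bracketed element ending in `*` is treated as a part-of-speech prefix, any other bracketed element as a lemma, a bare `*` as any single word, and a bare word as an exact form. The function names and the tagged tokens are hypothetical; only the tag prefixes `J`, `V`, and `NN` are taken from the queries above.

```python
def element_matches(element, token):
    """Check one query element against one (word, lemma, pos) token."""
    word, lemma, pos = token
    if element == "*":
        return True  # wildcard: any single word
    if element.startswith("[") and element.endswith("]"):
        inner = element[1:-1]
        if inner.endswith("*"):
            # e.g. [J*]: part-of-speech tag starting with "J"
            return pos.startswith(inner[:-1])
        # e.g. [mujer]: any form whose lemma is "mujer"
        return lemma == inner.lower()
    # bare word, e.g. por: exact word form, case-insensitive
    return word.lower() == element.lower()


def find_matches(pattern, tokens):
    """Return every contiguous word sequence that matches the pattern."""
    elements = pattern.split()
    hits = []
    for start in range(len(tokens) - len(elements) + 1):
        window = tokens[start:start + len(elements)]
        if all(element_matches(e, t) for e, t in zip(elements, window)):
            hits.append(" ".join(word for word, _, _ in window))
    return hits


if __name__ == "__main__":
    # Hand-made tagging for illustration only.
    tokens = [
        ("las", "el", "D"),
        ("mujeres", "mujer", "NN"),
        ("jóvenes", "joven", "J"),
        ("acabaron", "acabar", "V"),
        ("por", "por", "P"),
        ("ceder", "ceder", "V"),
    ]
    print(find_matches("[mujer] [J*]", tokens))
    print(find_matches("[acabar] por [V*]", tokens))
```

The point of the lemma and part-of-speech elements is that one query covers every inflected form (mujer, mujeres) and every member of a word class, which a plain string search cannot do.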

(Via infoling)

