I have a Cosmo Communicator from Planet Computers. I got it primarily because I need a 'phone with a physical keyboard, and I need a 'phone with a physical keyboard for two main reasons: I prefer the tactile nature of physical keyboards, allowing me to look at the screen as I type, and I depend on GNU/Emacs and org-mode for organising my life. Currently (and probably for the foreseeable future), I access GNU/Emacs through a terminal using the Termux Android app.
I recently figured out how to take voice notes that can be recorded directly into my org-mode set up, which reduces the time and effort I need to get something noted quickly.
Before now, if I wanted to take a note into my org-mode set-up, I would have to first open the clam-shell 'phone, unlock it, start or find a termux session, launch emacsclient (which will launch a new GNU/Emacs session if one isn't running already), find the correct org-mode file and then the correct org-mode headline, and only then start typing the note. Very labour intensive. I understand that org-capture is designed to help with this, but I've never been able to figure out how it's really supposed to be used. Also – and I am open to correction here – I don't think it helps until the point when I have an open GNU/Emacs window in front of me, by which time I'm already at the third-last step in that sequence.
What I outline below allows me to go to my 'phone's home screen, tap a widget, speak, and all the rest is done automatically.
Before embarking on this, it's important to note that my set up depends on "Google" being installed on my Android device, and it having been "Enabled". This means that the solution isn't fully Free Software. If someone knows of a Free Software speech-to-text implementation I could use instead, I will happily look into transferring over to it. At the time of writing, I have not found such a thing.
So, the sequence of set-up steps
Install F-Droid
This solution depends on Termux. Although Termux is available on Google Play, a recent change over there means that the developer isn't providing any more updates to Termux. If you're already using Termux from Google Play, this set-up will work I think, but if it doesn't, and for other reasons anyway, I recommend using Termux from F-Droid. In which case you need to install it. These instructions are not going to get into it, but the F-Droid site is a perfect resource for learning how to do that.
Install Termux, Termux:API and Termux:Widget
If you're using the Google Play version of Termux, you'll have to install the others from Google Play, too. Otherwise, install them all from F-Droid. You'll need:
- Termux as the main terminal utility.
- Termux:API to integrate the speech-to-text capture into Termux, and
- Termux:Widget to allow for invoking a small script from the Android home screen.
- Start Termux.
- From the Termux command line, run
pkg upgrade
to ensure that you have the latest version of all the packages that come with it.
- From the Termux command line, run
pkg install emacs termux-api termux-tools jq
which will install GNU/Emacs, jq for processing JSON objects, and termux-api and termux-tools for the termux-related functionality.
- Perform whatever GNU/Emacs setup you need to perform. This is up to your taste, as these instructions assume you're familiar with GNU/Emacs, for why else would this page interest you?
Perform whatever org-mode setup you need to perform. Again, I'm assuming you have an org-mode set-up that suits your way of working.
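Before going further, it can be worth confirming that the commands installed above are actually on the PATH. What follows is only a sketch: missing_commands is a name made up for illustration, and the list of commands in the example comment is an assumption based on the tools this post uses (git only matters if you use it).

```shell
# Hypothetical helper: print the name of every command given as an
# argument that cannot be found on the PATH. Prints nothing if all exist.
missing_commands() {
    local cmd
    for cmd in "$@"; do
        command -v "${cmd}" > /dev/null 2>&1 || printf '%s\n' "${cmd}"
    done
}

# Example (run it by hand in a Termux session):
#   missing_commands emacs emacsclient jq termux-dialog termux-toast termux-open git
```

An empty result suggests everything is in place. Note that the termux-* commands also need the Termux:API app itself to be installed before they will do anything useful.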
I have integrated my Org Agenda files with a revision control system for years. Currently it uses git, and if this is something that works for you, I recommend it. You can – of course – use an equally-good revision control system, like subversion, if that suits. However, where relevant, these instructions make use of git.
You should create a new org-mode file and add it to your org-agenda-files setting, so that any changes to it will be incorporated into your org-agenda calls. The value here is that your captured notes will go into it and your other files won't be affected.
In Termux, create a directory to contain the script that will capture the spoken message:
$ mkdir -pv ~/.shortcuts/tasks
Create a new script in that directory. Call it what you want. However, the following is vitally important: make sure that the "shebang" is correct.
If you don't know, the "shebang" is the first line of a shell script that informs the calling shell what command to use to execute it. For Termux:Widget scripts to work, they must have a shebang, and the shebang must be correct. If either of these is not the case, the script won't work, and you won't get any feedback helping you identify the problem.
To get the shebang correct, type the following command into a Termux shell:
$ which bash
This will return something like /data/data/com.termux/files/usr/bin/bash. The shebang, therefore, will be the following on its own as the very first line in the script (not the second with, like, a blank first line; it has to be the very first line):
#!/data/data/com.termux/files/usr/bin/bash
Whatever follows the exclamation mark is to be exactly what is returned by the which bash command.
- After the shebang (of course, because the shebang has to be the very first line), your script should do the following:
Capture spoken instructions using the following piped sequence of commands:
termux-dialog speech -t "Termux" | jq .text | sed 's/"//g'
termux-dialog speech -t "Termux" presents a speech-capture dialogue with the title Termux. You can set the title to something that suits you. This will send a small JSON object to stdout which contains a field called "text".
jq .text (don't forget the space between the q and the .) extracts the value of the "text" field in the JSON object and sends it to stdout.
sed 's/"//g' strips " characters from the output. This may be a little crude, but you're hardly going to speak double-quotes, are you?
Your script will then send the output of the above sequence into your dedicated org-mode file for capturing these notes, with the appropriate context around it. For example (see below), my setup creates a new 2nd-level headline as a TODO item, and it sets the SCHEDULED date cookie on that item to yesterday (so that it appears at the top of my agenda), and it sets the priority cookie to [#A].
Your script will append all of this into the relevant file.
- If you're using a revision control system, then your script should commit the new note into it so that it can be propagated to where you need it. Again, see below for how I use git for this.
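The formatting and appending steps described above can be sketched roughly as follows. This is a minimal illustration and not the script I actually use: the function names are made up, the TODO keyword and [#A] priority are just the choices described above, and the default date relies on GNU date's -d option being available.

```shell
# Hypothetical helper: print a 2nd-level org-mode headline for a note.
#   $1 - the note text
#   $2 - (optional) the SCHEDULED date; defaults to yesterday (GNU date)
format_capture_entry() {
    local note="$1"
    local when="${2:-$(date -d '1 day ago' '+%F %a')}"
    printf '** TODO [#A] %s\n' "${note}"
    printf '   SCHEDULED: <%s>\n' "${when}"
}

# Hypothetical helper: append the formatted entry to the capture file.
#   $1 - path to the org-mode capture file
#   $2 - the note text
#   $3 - (optional) the SCHEDULED date
append_capture_entry() {
    local file="$1"
    shift
    format_capture_entry "$@" >> "${file}"
}
```

In a real script, the note text would come from the termux-dialog pipeline, for example by capturing its output in a variable and passing that to append_capture_entry. As an aside, jq's -r option prints strings without the surrounding quotes, so jq -r .text may make the sed step unnecessary.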
Once you've the script written, set the permissions to allow for it to be executable with one of the following:
chmod -v +x ~/.shortcuts/tasks/<script_name> # to make it generally executable
or
chmod -v u+x ~/.shortcuts/tasks/<script_name> # to make it executable for the script's owner only
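The shebang and permission steps can be combined into one small helper, if you prefer not to do them by hand. This is a sketch only: create_widget_script is a made-up name, it assumes bash is on the PATH, and it refuses to touch a file that already exists.

```shell
# Hypothetical helper: create a new script whose very first line is a
# shebang pointing at the bash found on the PATH, then make the script
# executable for its owner.
#   $1 - path of the script to create
create_widget_script() {
    local script="$1"
    local bash_path

    # Refuse to overwrite an existing script.
    if [ -e "${script}" ]; then
        echo "${script} already exists" >&2
        return 1
    fi

    bash_path="$(command -v bash)" || return 1
    mkdir -p "$(dirname "${script}")" || return 1
    printf '#!%s\n' "${bash_path}" > "${script}" || return 1
    chmod u+x "${script}"
}

# Example (run it by hand in a Termux session):
#   create_widget_script ~/.shortcuts/tasks/<script_name>
```

The resulting file contains only the shebang, so the rest of the script still has to be written below it.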
- Follow the instructions to set up Termux:Widget on your home screen, which will present to you all the executable scripts in ~/.shortcuts/ and ~/.shortcuts/tasks/, which you can launch by tapping on them. Placing the script into ~/.shortcuts/ will launch a terminal screen to run it, but placing it into ~/.shortcuts/tasks will cause it to run in the background, which is what you probably want.
Now, you should test the script from the command line to confirm it works, simply by calling it from the terminal prompt:
$ ~/.shortcuts/tasks/<script_name>
which will present the speech-entry dialogue screen, and after you have spoken, it will close and you will see the note captured into the org-mode file.
If that works, then you can test it from the home screen widget.
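Because a script launched from the widget runs in the background and gives no feedback when it fails, it may help to send the output and exit status of each command to a log file while testing. The following is a sketch of that idea only: log_run and the default log location are made up for illustration.

```shell
# Assumed log location; change it to suit. LOG_FILE may also be set
# before this point, in which case that value is kept.
LOG_FILE="${LOG_FILE:-${TMPDIR:-/tmp}/widget-test.log}"

# Hypothetical helper: run a command, append its stdout and stderr to
# LOG_FILE, then append a line recording the time and exit status.
log_run() {
    local rc
    "$@" >> "${LOG_FILE}" 2>&1
    rc=$?
    printf '%s exit=%s: %s\n' "$(date '+%F %T')" "${rc}" "$*" >> "${LOG_FILE}"
    return "${rc}"
}
```

Prefixing a command in the script with log_run, and then reading the log file from a Termux session after tapping the widget, should show which step misbehaved.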
Once you have it working, then you may consider some of the other possibilities. See below for how I have implemented it, which does some other fancy things:
- I use keywords for different actions: "note" for org-mode notes, "wiki" to perform a Wikipedia search (using termux-open), "duck" to perform a DuckDuckGo search and "locate" to perform an OpenStreetMap.org search.
- I send feedback to the Android screen using the termux-toast command. I also capture output into a log file.
- As I am a heavy user of GNU/Emacs' --bg-daemon mode, I use emacsclient commands to instruct emacs to do other things when capturing the note, like invoking org-agenda-list, which refreshes my org agenda.
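The keyword idea boils down to splitting the spoken text into a lead word and the rest, and then branching on the lead word. Here is a stripped-down sketch of that, which only prints what it would do rather than launching anything. dispatch_note and the output labels are made up for illustration, and it uses parameter expansion and tr where my own script uses cut and sed.

```shell
# Hypothetical helper: split a spoken note into its lead word and the
# remainder, then print the action that would be taken.
#   $1 - the text returned by the speech-to-text step
dispatch_note() {
    local note="$1"
    local lead="${note%% *}"          # everything before the first space
    local action="${note#"${lead}"}"  # whatever follows the lead word
    action="${action# }"              # drop the separating space

    # Lower-case the lead word, as speech-to-text tends to capitalise it.
    lead="$(printf '%s' "${lead}" | tr '[:upper:]' '[:lower:]')"

    case "${lead}" in
        note)   printf 'capture: %s\n' "${action}" ;;
        wiki)   printf 'wikipedia: %s\n' "${action}" ;;
        duck)   printf 'duckduckgo: %s\n' "${action}" ;;
        locate) printf 'openstreetmap: %s\n' "${action}" ;;
        *)      printf 'unknown: %s\n' "${note}" ;;
    esac
}
```

For example, dispatch_note "Note buy milk" prints "capture: buy milk". In a real script each branch would call the relevant command instead of printf.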
Finally, all my other GNU/Emacs instances (on my many computers at home and at work) will automatically pick up the new note from the git repository. Whenever I refresh the agenda, I will see the new note as an overdue CAPTURED item, which will prompt me to do something about it.
Have fun, and let me know if you see any faults with my set up.
My personal setup
~/.shortcuts/tasks/voice_command.sh – Script to capture voice command.
#!/data/data/com.termux/files/usr/bin/bash

# Where the log outputs are to be sent.
export LOG_FILE=${HOME}/tmp/widget-test.out

# To convert a string into text suitable for a web query
urlencode() {
    # urlencode <string>
    old_lc_collate=$LC_COLLATE
    LC_COLLATE=C

    local length="${#1}"
    for (( i = 0; i < length; i++ )); do
        local c="${1:$i:1}"
        case $c in
            [a-zA-Z0-9.~_-]) printf '%s' "$c" ;;
            *) printf '%%%02X' "'$c" ;;
        esac
    done

    LC_COLLATE=$old_lc_collate
}

# To convert text suitable for a web query into normal text
urldecode() {
    # urldecode <string>
    local url_encoded="${1//+/ }"
    printf '%b' "${url_encoded//%/\\x}"
}

# To send a notification to the 'phone
notify () {
    termux-toast -g top "${1}"
    echo "Notification ${1}" >> ${LOG_FILE}
}

# A debug notification/entry
debug () {
    if [ "${DEBUG}" = "Y" ]; then
        notify "DEBUG: ${1}"
    fi
}

# Processing command-line options.
export DEBUG=N
while getopts "d" opt; do
    case ${opt} in
        d) # We want DEBUG output
            if [ "${DEBUG}" = "N" ]; then
                DEBUG=Y
            else
                set -x
            fi # if the command-line includes " -d -d" or "-dd" bash -x is used.
            ;;
        *)
            echo "Oops"
            exit 1
            ;;
    esac
done

# NOTE is what is converted from speech to text, in text form
export NOTE
# LEAD_WORD is the first word of the note, which is used to instruct
# this script
export LEAD_WORD
# ACTION is everything after the LEAD_WORD
export ACTION

# Ask for the voice command.
NOTE="$(termux-dialog speech -t "Termux" | jq .text | sed 's/"//g')"
debug "NOTE is \"${NOTE}\""

# Separate out the lead word and the instruction
LEAD_WORD="$(echo ${NOTE} | cut -d' ' -f1)"
debug "LEAD_WORD is \"${LEAD_WORD}\""
ACTION="$(echo ${NOTE} | sed "s/^${LEAD_WORD} //")"
debug "ACTION is \"${ACTION}\""

# Convert the lead word to lower case, making it easier to test for.
# (",," lower-cases every character; a single "," would only lower-case
# the first one.)
LEAD_WORD="${LEAD_WORD,,}"
debug "LEAD_WORD is \"${LEAD_WORD}\""

# If the LEAD_WORD is "note", then this is an org capture
# note. ("org", "org mode" and "capture" were too fluffy and failed a
# lot)
if [ "${LEAD_WORD}" = "note" ]; then
    debug "ACTION is \"${ACTION}\""
    # org-capture.sh's own shebang is for /bin/bash, but that won't work
    # on Termux, so we invoke the script through Termux' bash.
    /data/data/com.termux/files/usr/bin/bash ${HOME}/eibhear_org/scripts/org-capture.sh "${ACTION}" >> ${LOG_FILE}
    notify "ORG Capture of \"${ACTION}\" complete"
# If the LEAD_WORD is "duck", perform a DuckDuckGo search on the ACTION
elif [ "${LEAD_WORD}" = "duck" ]; then
    export SEARCH_TERM="$(urlencode "${ACTION}")"
    debug "SEARCH_TERM is \"${SEARCH_TERM}\""
    termux-open "https://duckduckgo.com/?q=${SEARCH_TERM}&t=termux-open"
# If the LEAD_WORD is "wiki", perform a wikipedia search on the ACTION
elif [ "${LEAD_WORD}" = "wiki" ]; then
    export SEARCH_TERM="$(urlencode "${ACTION}")"
    debug "SEARCH_TERM is \"${SEARCH_TERM}\""
    termux-open "https://en.wikipedia.org/wiki/Special:Search?search=${SEARCH_TERM}&sourceid=termux-open"
# If the LEAD_WORD is "locate", perform an openstreetmap.org search on the ACTION
elif [ "${LEAD_WORD}" = "locate" ]; then
    export SEARCH_TERM="$(urlencode "${ACTION}")"
    debug "SEARCH_TERM is \"${SEARCH_TERM}\""
    termux-open "https://www.openstreetmap.org/search?query=${SEARCH_TERM}"
else
    notify "Can't parse \"${NOTE}\", so don't know what to do with it."
fi
${HOME}/eibhear_org/scripts/org-capture.sh – Script to capture an org-mode entry for later processing.
#!/bin/bash

# Processing command-line options.
export DEBUG=N
export DONT_COMMIT=N
while getopts "dn" opt; do
    case ${opt} in
        d) # We want DEBUG output
            if [ "${DEBUG}" = "N" ]; then
                DEBUG=Y
            else
                set -x
            fi # if the command-line includes " -d -d" or "-dd" bash -x is used.
            ;;
        n) # We don't want to commit and push this note
            DONT_COMMIT=Y
            ;;
        *)
            echo "Oops"
            exit 1
            ;;
    esac
done
# shift to the first non-option parameter.
shift $(( ${OPTIND} - 1 )); unset OPTIND

if [ "${DONT_COMMIT}" = "N" ]; then
    # fetch and update -- it doesn't really matter if this doesn't work
    # egit-update-org-agenda-files is a personal elisp utility I have
    # that updates the org-agenda-files from my git repository.
    emacsclient -e '(egit-update-org-agenda-files)'
fi

# Send the information into an agenda file. To get it to pop up to the
# top of the agenda, set the scheduled date to yesterday. A stupid,
# but effective, hack.
echo "** CAPTURED [#A] (from org-capture.sh) ${1}" >> ${HOME}/eibhear_org/capture.org
echo "   SCHEDULED: <$(date -d '1 day ago' '+%F %a')>" >> ${HOME}/eibhear_org/capture.org

# Revert all the agenda files and rebuild the
# agenda. e-revert-org-agenda-file-buffers is another personal elisp
# function I wrote to do this.
emacsclient -e '(progn (e-revert-org-agenda-file-buffers) (org-agenda-list))'

if [ "${DONT_COMMIT}" = "N" ]; then
    # Add the updated capture file
    git -C ${HOME}/eibhear_org add capture.org
    retcode=${?}
    # egit-get-alerter-func is a personal elisp function I use to get
    # the name of the function that will send alerts, as this will
    # differ from system to system (GNU/Linux, Windows, Android,
    # Sailfish OS, etc.)
    if [ ${retcode} -ne 0 ]; then
        emacsclient -e "(apply (egit-get-alerter-func) \"Org Capture\" (list \"Problem git-adding capture.org\"))"
        exit ${retcode}
    fi
    # Commit the updated capture file
    git -C ${HOME}/eibhear_org commit -m "A new note recorded through org-capture.sh"
    retcode=${?}
    # Test the saved value: ${?} would now hold the status of the
    # assignment above, which is always 0.
    if [ ${retcode} -ne 0 ]; then
        emacsclient -e "(apply (egit-get-alerter-func) \"Org Capture\" (list \"Problem git-committing new note\"))"
        exit ${retcode}
    fi
    # Push the updated capture file
    git -C ${HOME}/eibhear_org push
    retcode=${?}
    if [ ${retcode} -ne 0 ]; then
        emacsclient -e "(apply (egit-get-alerter-func) \"Org Capture\" (list \"Problem git-pushing new note\"))"
        exit ${retcode}
    fi
fi

# Notify of the completion of the capture and update the agenda.
emacsclient -e "(progn
                  (e-revert-org-agenda-file-buffers)
                  (org-agenda-list)
                  (apply (egit-get-alerter-func) \"Org Capture\" (list (format \"Note (%s) taken\" \"${1}\"))))"
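The add, commit and push steps in org-capture.sh each repeat the same "run it, check the result, alert on failure" pattern. One way to write that pattern once is sketched below. It is an illustration and not what the script above does: try_step is a made-up name, and alert here is a stand-in that only writes to stderr, where the real script asks GNU/Emacs to raise the alert.

```shell
# Stand-in for the real alert mechanism; it only writes to stderr.
alert() {
    printf 'Org Capture: %s\n' "$1" >&2
}

# Hypothetical helper: run a command; if it fails, raise an alert that
# names the step, and hand back the command's own exit status.
#   $1  - short description of the step
#   $2+ - the command and its arguments
try_step() {
    local description="$1" rc
    shift
    "$@"
    rc=$?
    if [ "${rc}" -ne 0 ]; then
        alert "Problem ${description}"
    fi
    return "${rc}"
}
```

With this in place, a step could be written as try_step "git-adding capture.org" git -C "${HOME}/eibhear_org" add capture.org || exit $?, which keeps the exit status and the check together on one line.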
capture.org – Template org-mode file
# For quick capture of notes. The TODO keywords are CAPTURED, denoting
# that it was entered by the org-capture script, and TRANSFERRED,
# denoting that the note has been transferred to another org-mode file
# and therefore has been processed from here.

#+TODO: CAPTURED | TRANSFERRED

* TODOS -- Remove from here as each is moved to the respective target location :captured:
** TRANSFERRED [#A] (from org-capture.sh) call mother to wish her a happy birthday
   SCHEDULED: <2021-03-01 Mon>
** CAPTURED [#A] (from org-capture.sh) dentist appointment on the 25th at 1:30
   SCHEDULED: <2021-03-03 Wed>
- Update
- Following Uwe's suggestion, I was able to confirm that this now works for me on Android 11. I've moved these addendums to the bottom of the post so that they are no longer in the way.
- Update
See the comment below from Uwe, who has been able to get this to work on Android 11.
I don't have an Android 11 device to hand right now, but once I do and get this working, I'll remove these updates and correct the post where necessary.
- Update
The termux-dialog speech command doesn't work on Android 11. Looking briefly into it, it seems that Google changed the speech-to-text API in Android 11, and termux-api hasn't adopted that change. It's not fair to call it a bug, as such, but until the termux guys get a chance to fix it, what follows can be described as not working on Android 11.
agdad on github has raised an issue about this, but as I am not on github, I can't promote it as something I would like to see fixed, too. I have brought it to the attention of the termux gitter/matrix room, but I don't know if it has been seen.
Therefore, if this is something of interest to you, and if you are on github or have some other way to bring this to the attention of the termux team, please let them know.
Once I learn that this works on Android 11, I'll update this post.