How is Synthesis a natural language?

mentaloid · March 12, 2024, 6:38pm

DISCLAIMER: The following discussion:

is not reviewed by a certified linguist
- Rather than the accuracy of this writing, focus on its spirit.
depends on some heavily-contested definitions. It is meant:
- to provide the reasoning
- to set the expectations right
- not to persuade anyone

Short version

Synthesis is a natural language because:

it is natural to humans
- you can think anything directly in it
  - You don’t have to translate all the time back-and-forth from/to other languages.
    - no more pseudocode
it was designed and developed like that:
- It hides or downplays unnatural elements (addresses, operators, special notations etc.)
- It emphasizes natural elements (word-parts, phrases, redundancy etc.)

Long version

What is NOT a natural language:

found in nature (counter-intuitively)
- used by animals
  - natural to them, but not to humans
  - instinctive and usually non-intelligent
  - too limited to express anything complicated
  - crying-baby language belongs here
- molecular codes (DNA etc.)
  - natural to cells, but not to humans
  - totally impractical for human usage
  - not fully understood anyway
notations
- mathematical notation
  - only a few natural elements here and there
    - The rest is purposely codified to:
      - save space
      - make it international
  - arguably natural for some calculations
- musical notation
  - none of its flavors qualifies as natural enough
    - many musicians don’t even use it
  - almost useless beyond music
most programming languages
- machine languages
  - natural to machines, but not to humans
  - kin to molecular codes and equally impractical
- symbolic addressing languages
  - effectively refined machine languages
    - more practical, equally unnatural
- high-level languages
  - far removed from machines, but not much closer to humans
  - kin to mathematical notation
    - some of them feel very familiar to mathematicians
    - very few natural elements
- domain-specific languages
  - kin to musical notation
  - some of them feel very natural, until some deviation is attempted
- most meta-languages
  - some of the above languages
  - They enable the creation of other languages, including natural languages.
    - But they do it in unnatural ways, so they don’t qualify as natural themselves.
- Comparison of Synthesis to other programming languages

What is NOT needed for a language to be considered natural:

to take no effort
- For a baby to learn a natural language (or more) is a natural process, but the effort remains enormous.
- Synthesis takes relatively little effort, but is no exception.
to be your native language
- Your native language is not the only natural language.
  - But you can bring Synthesis closer to your native language than with most other options.
to be an old language
- All natural languages were young at some point.
- There is a chance that Synthesis is older than you.
to have a complicated grammar or other linguistic features
- Different natural languages have different features and levels of complexity.
- Some grammar is needed, but complexity is not a prerequisite.
- Synthesis’ simplicity is a designed feature.
to possess ambiguity
- an attractive optional feature, which facilitates some forms of poetry and other word-plays
- fundamentally unacceptable in executable code
  - In Synthesis there is always a single context-specific meaning.
to not be artificial
- All human languages are artificial, i.e. made up by humans.
  - Just as artificial as Esperanto or medical terminology.
  - For crying-baby language, refer to animals above.
to be totally free from synthetic elements
- All practical written languages incorporate synthetic elements.
  - primarily their marks
  - Pureness is not an end in itself.
  - Natural languages are verbose.
    - Synthetic elements help shorten them.
      - Brevity is not an end in itself either.
      - Excessive brevity does more harm than good.
  - Both balance and options are needed.
- There is an essential difference between:
  - a synthetic language with natural elements
    - e.g. mathematical notation
  - a natural language with synthetic elements
    - most contemporary natural languages

Examples of natural languages:

common languages and their dialects
- the most natural to humans
slang languages (street, sailors’, pilots’, soldiers’ etc.)
- natural to their users, not radically different from common languages
some special vocabularies or terminologies (e.g. scientific names)
- kin to notations, but made of natural elements (prefixes/suffixes etc.)
some sign languages
- not needing mouths/ears, yet very natural
- kin to domain-specific languages, but with universal application
- possible ancestors of written languages

What makes a language natural:

Its natural elements decisively outnumber its synthetic ones.
- Synthesis’ early versions did not satisfy this criterion.
  - But it has been designed and evolved to gradually smash it, in a bottom-up approach.
It is possible for humans:
- to learn to understand it, use it and extend it
  - without help (i.e. through experience) and in less than a few years
    - of course help can accelerate that process considerably
      - In Synthesis, much of this help is available on demand by the language itself.
        
        far beyond error messages or IDEs
- to think in it
  - This is the ultimate criterion, though not easily verifiable.
  - Reasoning in a different language than the one you write in is unnatural.
    - In humans, this results in:
      - low productivity
      - occasionally writing meaningful but unnatural sentences
        
        with respect to the written language
    - In the case of LLMs, they write natural sentences that mean nothing to them.
      - [Comparison of Synthesis to modern AI] (topic under construction)
    - In Synthesis, all reasoning above the platform level is natural and visible.

Challenge

Which one of the following lines is natural?

let btn = Button.forTitle("click me"); btn.addHandler("click", myHandler)
(add_handler my_handler :click (button_for_title "click me"))
button for title "click me"; add click handler my-handler

qerty · March 30, 2024, 1:39pm

This seems quite interesting. But I think the idea of making a language easier by making it superficially look similar to human language is misguided. Why do I think this?

(Warning: This is pretty much just a disorganized dump of my ideas about making an intuitive programming language.)

For absolute beginners to programming, this might indeed be helpful. They have no prior mental model of what programming looks like, and if the language is composed of English words, you can get at least a vague sense of what they do. But that won’t get you very far. The language may be fairly easy to read, but as soon as you start writing it yourself, you will find that none of the principles you can use when talking to other human beings work here. The vocabulary may be similar, but you’ll still have to use an unambiguous grammar understood by a computer, which is and will always be completely different from using English.

Instead, I think there are much more effective ways to make a language useful for beginners:

Discoverability

As soon as you understand the basics (a program is a series of instructions / a transformation on a value / whatever), your biggest problem will become discovery. You find yourself constantly having to look up how to do things, and having to remember just how to arrange these things so the machine doesn’t complain.

This is where visual programming environments have an inherent advantage: They can present to the user a list of things they can do, and the user can mix and match and play around as much as they like. Documentation can be reachable in a single button click, and presented in a visually engaging way. Grammar can be illustrated simply by using shapes that naturally fit together in certain ways, something humans are exposed to all the time in real life.

Now, visual programming systems often have the disadvantage of slowing experienced developers down. Their ease of use becomes a burden to more advanced users. I don’t think it necessarily has to be this way, and I think one could develop a visual language that is every bit as fast to navigate as our conventional text-based ones, but it certainly isn’t easy.

Neither do I think you have to choose between the two: You can have both be interchangeable views on the same abstract program, and let the user switch between them freely, or copy visual blocks as text to share with others.

One technology that may tremendously improve discovery is LLMs. The user can simply ask how to do what they want to do in actual natural language, and get a program back. Over time, as humans are pretty good at pattern matching, they will get some sense for what works and what doesn’t. LLMs could even be integrated right into the language as a powerful escape hatch for when a problem gets too difficult. For example (disregard the syntax please):

makeShakespeare = llm “Rewrite the following text as if it was written by Shakespeare.”
print (makeShakespeare “Be true to yourself.”)
// prints “To thine own self be true.”

Of course, this would require local LLMs to be performant and small enough that they can be shipped with the software to all users, which is certainly not the case right now. But I’m optimistic for the future.
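
A minimal Python sketch of how such an escape hatch could look, for illustration only. The `llm` helper, the `Backend` type and the `canned_backend` stub are all hypothetical names: the stub stands in for a real local model and only knows the single example above.

```python
from typing import Callable

# Hypothetical backend type: takes a full prompt, returns the model's reply.
Backend = Callable[[str], str]


def canned_backend(prompt: str) -> str:
    """Stand-in for a real local model (an assumption for illustration).

    It only recognizes the single example from the post and echoes
    the input text otherwise.
    """
    text = prompt.split("\n\n", 1)[1]
    if "Shakespeare" in prompt and text == "Be true to yourself.":
        return "To thine own self be true."
    return text


def llm(instruction: str, backend: Backend = canned_backend) -> Callable[[str], str]:
    """Turn a natural-language instruction into an ordinary function."""
    def run(text: str) -> str:
        # The instruction and the input are joined into one prompt.
        return backend(f"{instruction}\n\n{text}")
    return run


make_shakespeare = llm("Rewrite the following text as if it was written by Shakespeare.")
print(make_shakespeare("Be true to yourself."))
```

Passing the backend as a parameter keeps the rest of the program independent of whichever model is eventually shipped.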

Arbitrary limitations / overly rigid structure

Another pervasive problem I see with programming is that even dynamic languages like Python are far more rigid than they need to be. This makes sense for software that is intended to be distributed, but why limit yourself in this way when you’ll always be the only user? It’s bothersome that we get beginners accustomed to these limitations so early that they are often unable to conceptualize programming without them.

Some examples of distinctions I think may be unnecessary:

The distinction between code and values, or between programs and their outputs.
- The output could simply be code itself, that would result in the same output.
- This would make even more sense in a visual programming language, where you could drag and drop values from anywhere to anywhere else, and visualize complex outputs in all sorts of ways (geographical maps, line charts, etc.)
The distinction between stored data and in-memory data. Why does storing stuff have to be so difficult?
- Making outputs be code already gets you halfway there.
  - For the other half, all you have to do is make running untrusted code safe. This is easy, as long as you limit introspection and don’t make effectful functions importable (instead, pass them in to the program at the entry point).
Boundaries between code versions, packages, modules etc.
- Code may instead be in a flat structure, not existing solely in one place, potentially content-addressed, with a powerful search engine and each user having their own “dictionary”.
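
To make the first two points a bit more concrete, here is a small Python sketch (an illustration only, not a feature of any language discussed here). The printed form of a value is itself code that evaluates back to the same value, so storing it amounts to keeping that text. `ast.literal_eval` accepts only literals, which is one simple way to load such text without running arbitrary code.

```python
import ast

# A value produced by some program.
inventory = {"apples": 3, "tags": ["fresh", "red"], "price": 0.5}

# Its output is code: repr() yields a literal that evaluates to the same value.
as_code = repr(inventory)
print(as_code)

# "Storing" is then just keeping the text; "loading" is evaluating it.
restored = ast.literal_eval(as_code)
print(restored == inventory)

# Anything that is not a plain literal (here, a function call) is refused.
try:
    ast.literal_eval("print('hi')")
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```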

Programming models

I do think some programming models, regardless of syntax, are more intuitive to some people than others. For most people, I’d say imperative programming is the most natural. It’s easy to explain that a program is simply a series of instructions to follow. On the other hand, the referential transparency inherent to functional programming avoids a lot of confusion about reference and value types, equality and the like.
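
The kind of confusion meant here can be shown in a few lines of Python (a sketch for illustration; the function names are made up). The imperative version mutates a list that two names share, while the referentially transparent version returns a new value and leaves its input untouched.

```python
def add_item_in_place(items: list[str], item: str) -> None:
    """Imperative style: changes the caller's list."""
    items.append(item)


def with_item(items: tuple[str, ...], item: str) -> tuple[str, ...]:
    """Functional style: returns a new tuple; the input is never changed."""
    return items + (item,)


# Two names, one list: a change through one name is visible through the other.
cart = ["tea"]
backup = cart
add_item_in_place(cart, "milk")
print(backup)            # ['tea', 'milk']
print(backup is cart)    # True

# With immutable values, the same call always yields the same result.
cart_v1 = ("tea",)
cart_v2 = with_item(cart_v1, "milk")
print(cart_v1)           # ('tea',)
print(cart_v2)           # ('tea', 'milk')
```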

Prior art I highly recommend checking out for inspiration

Embark by Ink and Switch
Glamorous Toolkit
ScrapScript
Scratch

mentaloid · March 31, 2024, 11:42pm

Welcome. Thank you for the opportunity to clarify a few things:

Discoverability is totally irrelevant to the language.
- It is rather an aid provided by the environment.
  - Synthesis lab already has its own set of such aids and many more can and will be developed.
Synthesis:
- is not trying to be English
  - No matter that it sometimes looks like English.
    - and could be made to look much more like it
    - This particular similarity is indeed superficial.
      - But makes it familiar to read.
    - The deep similarity is in being natural.
      - That makes it familiar to think in it.
  - The above article doesn’t even mention English.
  - Compared to English, Synthesis is on purpose:
    - much simpler
      - not made to write Shakespeare
        
        Leave this to LLMs, which are good with fiction.
    - much easier
      - not requiring you to write like Shakespeare
        
        Leave this to LLMs, which can easily make salads.
    - much lighter
      - not requiring a local archive of every English text corpus
        
        Leave this to LLMs, which are quantity-first par excellence.
- can use the vocabulary of any language
  - and of multiple languages at the same time
    - without mixing the contexts
  - though cannot use their grammar
- rarely complains about anything (despite some exceptions)
  - That is because it has limited expectations.
    - Not only less than other programming languages, but also less than humans.
      - Human languages are very forgiving, but humans are not.
        
        If you try to give complicated instructions to humans, their complaints rival those of computers.
        
        Detailed tutorials exist to clarify what the languages could not.
      - I receive more complaints from humans than from Synthesis.
        
        Of course some humans are bigger complainers than others.
      - Worse than complaining is misunderstanding.
        
        The convenience of guessing cannot outweigh the problems from guessing wrongly.
        
        Human languages are disqualified, because their ambiguity costs lives.
        
        e.g. the order given to the “light brigade”
What you wrote applies to French more than to Synthesis.
- For historical reasons, English and French:
  - share much vocabulary
  - use very different grammars
- It is easier to move from English to Synthesis than from English to French.
  - Not to mention other human languages, which are more alien to English than Synthesis is.
- All these languages are both:
  - different
    - There is no easy way to interchange them.
  - natural
    - Humans can think directly in them.
Visual programming:
- is not about tools that somehow write code for you
  - They may as well write in Synthesis.
  - Therefore, I’m not going to touch this category.
- is about programming in a visual way
  - like the mentioned Scratch
- shares the characteristics of domain-specific languages:
  - Can be dreamlike in narrow applications.
  - The dream breaks on first deviation.
- on the positive side, it is:
  - easy to learn
    - Because we are visual learners.
  - easy to write
    - Because we don’t have to type.
      - Though typing can be faster.
  - great for teaching
    - Because it is like a game.
- on the negative side, it:
  - is not easy to read
    - You need to translate the whole program into your own language.
      - Unless you are the person who put the shapes together.
        
        i.e. having already done the inverse translation
  - doesn’t scale at all
    - By now experience has shown that visual programming is impractical for anything beyond playing.
    - The same attributes make visual programming of:
      - smaller programs easy
        
        particularly fun to debug
      - bigger programs a nightmare
        
        especially hard to reason about
    - Fitting preshaped pieces is nice, but:
      - is dead-on-arrival (read the Long answer in Redundancy in Synthesis)
      - although it prevents many errors, it doesn’t prevent the most painful ones
        
        Effective prevention should involve the mind more than the hands.
        
        People are notorious for getting frustrated with limitations and wanting to break stuff.
        
        Learning through mistakes works better.
      - it turns programming into a puzzle
        
        initially a fun challenge
        
        later a boring unproductive activity
        
        can bound creativity within its creator’s frameset
        
        The freedom of thinking out-of-the-box begins from the language we think in.
      - it is an inferior approach: In Synthesis, almost everything fits.
        
        It won’t always produce the expected result, but it will run and show us what it did and why.
        
        We learn more from this than from safety barriers.
        
        This is a design decision (thus a trade-off) with important productivity gains.
      - the shapes are arbitrary, they don’t correspond to anything in reality
        
        Being visual doesn’t automatically make something intuitive.
  - doesn’t generally fit the human mind and intuition
    - Although humans can and do think in visual ways, as much as in natural language.
      - Yet no visual tool to my knowledge approximates visual thinking.
        
        We still have to think in pseudocode, then translate to the visual tool.
  - doesn’t fit well in other environments
    - Synthesis lab takes advantage of Logseq’s outliner, which already adds a visual element.
      - Synthesis’ code blends with normal text (e.g. notes) like no other approach.
Much of the rest of your post is:
- wishful thinking
  - May or may not be possible in the future.
  - Synthesis is an implemented reality today (though partially).
- borderline off-topic
  - Granted, you warned about it, but you could keep it a little more focused.
  - This topic is about natural languages.
    - You could argue that either:
      - Synthesis is not as natural as the article claims
        
        You haven’t addressed the points made by the article.
      - a natural language is not the most intuitive approach
        
        You haven’t shown how visual programming is more intuitive.
    - Expressing your argument in a visual way could give your ideas some weight.
Thank you in advance for not using LLMs to write your posts.