• 5 Posts
  • 92 Comments
Joined 1 year ago
Cake day: May 8th, 2023


  • A1kmm@lemmy.amxl.com to Privacy@lemmy.ml: *Permanently Deleted*
    21 days ago

    When people say Local AI, they mean things like the Free / Open Source Ollama (https://github.com/ollama/ollama/): you can read its source code and check that nothing in it phones home, and you have complete control over when and if you upgrade it. If you don’t like something in the code base, you can also fork it and start your own version. The actual models (e.g. Mistral is a popular one) used with Ollama are commonly distributed in GGUF format (the successor to GGML), which doesn’t even carry executable code - only massive multi-dimensional arrays of numbers (tensors) that represent the parameters of the LLM.

    Now not trusting that the output is correct is reasonable. But in terms of trusting the software not to spy on you when it is FOSS, it would be no different to whether you trust other FOSS software not to spy on you (e.g. the Linux kernel, etc…). Now that is a risk to an extent if there is an xz style attack on a code base, but I don’t think the risks are materially different for ‘AI’ compared to any other software.


  • Blockchain is great for when you need global consensus on the ordering of events (e.g. Alice gave all her 5 ETH to Bob first, so a later transaction to give 5 ETH to Charlie is invalid). It is an unnecessarily expensive solution just for archival, since it necessitates storing the data on every node forever.
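
To make the ordering point concrete, here is a toy sketch in Python. It is not how Ethereum is implemented; the `apply_in_order` helper and the account names are made up for illustration. It only shows why every node has to agree on the order: whichever of Alice's two transfers comes first is accepted, and the other is rejected.

```python
def apply_in_order(balances, transactions):
    """Apply transfers in the agreed order; reject any that overspend.

    Hypothetical helper for illustration only: a real chain also checks
    signatures, nonces, fees, and so on.
    """
    balances = dict(balances)  # work on a copy
    results = []
    for sender, recipient, amount in transactions:
        if balances.get(sender, 0) >= amount:
            balances[sender] -= amount
            balances[recipient] = balances.get(recipient, 0) + amount
            results.append((sender, recipient, amount, "accepted"))
        else:
            results.append((sender, recipient, amount, "rejected"))
    return balances, results


start = {"alice": 5}
to_bob = ("alice", "bob", 5)
to_charlie = ("alice", "charlie", 5)

# Same two transactions, two different orderings, two different outcomes.
final_a, log_a = apply_in_order(start, [to_bob, to_charlie])
final_b, log_b = apply_in_order(start, [to_charlie, to_bob])
print(final_a)
print(final_b)
```

Without a shared ordering, two honest nodes could each end up with one of these two final states and never reconcile them.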

    Ethereum charges ‘gas’ fees per transaction which helps ensure it doesn’t collapse under the weight of excess usage. Blocks have transaction limits, and transactions have size limits. It is currently working out at about US$7,500 per MB of block data (which is stored forever, and replicated to every node in the network). The Internet Archive have apparently ~50 PB of data, which would cost US$371 trillion to put onto Ethereum (in practice, attempting this would push up the price of ETH further, and if they succeeded, most nodes would not be able to keep up with the network). Really, this is just telling us that blockchain is not appropriate for that use case, and the designers of real world blockchains have created mechanisms to make it financially unviable to attempt at that scale, because it would effectively destroy the ability to operate nodes.
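
As a rough check of that arithmetic, here is a back-of-the-envelope sketch in Python. It assumes decimal units (1 PB = 10^9 MB) and uses the rounded per-MB figure quoted above, so it lands slightly off the US$371 trillion figure, which presumably came from a less rounded price.

```python
COST_PER_MB_USD = 7_500   # approximate figure quoted above
ARCHIVE_SIZE_PB = 50      # approximate figure quoted above
MB_PER_PB = 10**9         # assumption: decimal units (1 PB = 10^9 MB)

total_usd = COST_PER_MB_USD * ARCHIVE_SIZE_PB * MB_PER_PB
print(f"~US${total_usd / 10**12:.0f} trillion")
```

Either way the result is in the hundreds of trillions of US dollars, which is the point being made.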

    The only real reason to use an existing blockchain anyway would be on the theory that you could argue it is too big to fail due to legitimate business use cases, and too hard to remove censorship resistant data. However, if it came to be used mostly for censorship resistant data sharing, with transactions in the minority, I doubt that this would stop authorities going after node operators and so on.

    The real problems that an archival project faces are:

    • The cost of storing and retrieving large amounts of data. That could be decentralised using a solution where not all data is stored on a chain - for example, IPFS.
    • The problem of curating data and deciding what is worth archiving, and what is a true-to-source archive vs fake copy. This probably requires either a centralised trusted party, or maybe a voting system.
    • The problem of censorship. Anonymity and opaqueness about what is on a particular node can help - but they might in some cases undermine the other goals of archival.

  • This is absolutely because they pulled the emergency library stunt, and they were loud as hell about it. They literally broke the law and shouted about it.

    I think that you are right as to why the publishers picked them specifically to go after in the first place. I don’t think they should have done the “emergency library”.

    That said, the publishers’ arguments show they have an anti-library agenda that goes beyond just the emergency library.

    Libraries are allowed to scan/digitize books they own physically. They are only allowed to lend out as many as they physically own though. Archive knew this and allowed infinite “lend outs”. They even openly acknowledged that this was against the law in their announcement post when they did this.

    The trouble is that the publishers are not just going after them for infinite lend-outs. The publishers are arguing that they shouldn’t be allowed to lend out any digital copies of a book they’ve scanned from a physical copy, even if they lock away the corresponding numbers of physical copies.

    Worse, they got a court to agree with them on that, which is where the appeal comes in.

    The publishers want it to be that physical copies can only be lent out as physical copies, and for digital copies the libraries have to purchase a subscription for a set number of library patrons and concurrent borrows, specifically for digital lending, and with a finite life. This is all about growing publisher revenue. The publishers are not stopping at saying the number of digital copies lent must be less than or equal to the number of physical copies, and are going after archive.org for their entire digital library programme.


  • A1kmm@lemmy.amxl.com to Asklemmy@lemmy.ml: Are you a 'tankie'
    1 month ago

    No

    On economic policy I am quite far left - I support a low Gini coefficient, achieved through a mixed economy, but with state provided options (with no ‘think of the businesses’ pricing strategy) for the essentials and state owned options for natural monopolies / utilities / media.

    But on social policy, I support social liberties and democracy. I believe the government should intervene, with force if needed, to protect people’s rights from interference by others (including rights to bodily safety and autonomy, not to be discriminated against, the right to a clean and healthy environment, and the right not to be exploited or misled by profiteers) and to redistribute wealth from those with a surplus to those in need / to fund the legitimate functions of the state. Outside of that, people should have social and political liberties.

    I consider being a ‘tankie’ to require both the leftist aspect (✅) and the authoritarian aspect (❌), so I don’t meet the definition.


  • I looked into this previously, and found that there is a major problem for most users in the Terms of Service at https://codeium.com/terms-of-service-individual.

    Their agreement talks about “Autocomplete User Content” as meaning the context (i.e. the code you write, when you are using it to auto-complete, that the client sends to them) - so it is implied that this counts as “User Content”.

    Then they have terms saying you licence them all your user content:

    “By Posting User Content to or via the Service, you grant Exafunction a worldwide, non-exclusive, irrevocable, royalty-free, fully paid right and license (with the right to sublicense through multiple tiers) to host, store, reproduce, modify for the purpose of formatting for display and transfer User Content, as authorized in these Terms, in each instance whether now known or hereafter developed. You agree to pay all monies owing to any person or entity resulting from Posting your User Content and from Exafunction’s exercise of the license set forth in this Section.”

    So in other words, let’s say you write a 1000 line piece of software, and release it under the GPL. Then you decide to trial Codeium, and autocomplete a few tiny things, sending your 1000 lines of code as context.

    Then next week, a big corp wants to use your software in their closed source product, and doesn’t want to comply with the GPL. Exafunction can sell them a licence (“sublicense through multiple tiers”) to allow them to use the software you wrote without complying with the GPL. If it turns out that you used some GPLd code in your codebase (as the GPL allows), and the other developer sues Exafunction for violating the GPL, you have to pay any money owing.

    I emailed them about this back in December, and they didn’t respond or change their terms - so they are aware that their terms allow this interpretation.


  • The best option is to run the models locally. You’ll need a good enough GPU - I have an RTX 3060 with 12 GB of VRAM, which is enough to do a lot of local AI work.

    I use Ollama, and my favourite model to use with it is Mistral-7b-Instruct. It’s a 7 billion parameter model optimised for instruction following, but usable with 4 bit quantisation, so the model takes about 4 GB of storage.
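
A quick sketch of where a figure like 4 GB comes from, in Python. It assumes a nominal 7 billion parameters (the exact count is a little higher) and exactly 4 bits per weight, so it covers the raw weights only.

```python
PARAMS = 7_000_000_000   # nominal "7b"; assumption, the exact count is a little higher
BITS_PER_WEIGHT = 4      # 4-bit quantisation

weights_bytes = PARAMS * BITS_PER_WEIGHT // 8
weights_gb = weights_bytes / 10**9
print(f"{weights_gb:.1f} GB for the weights alone")
```

The gap up to roughly 4 GB is plausibly explained by the extra data quantised formats carry (such as per-block scaling factors) and by some tensors being kept at higher precision. For comparison, the same weights at 16 bits each would need about 14 GB, which would not fit in 12 GB of VRAM.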

    You can run it from the command line rather than a web interface - run the container for the server, and then something like docker exec -it ollama ollama run mistral, giving a command line interface. The model performs pretty well; not quite as well on some tasks as GPT-4, but also not brain-damaged from attempts to censor it.
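
Besides the interactive command line, the Ollama server also exposes an HTTP API, which is handy for scripting. The following is a minimal sketch using only the Python standard library. It assumes the container publishes Ollama's default port 11434 on localhost and that the `mistral` model has already been pulled; `build_request` and `generate` are illustrative helper names, not part of Ollama.

```python
import json
import urllib.request

# Assumption: default Ollama port, published from the container to localhost.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="mistral"):
    """Build (but do not send) a non-streaming generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="mistral"):
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]


# Needs the server to be running:
# print(generate("Explain RAID-1 in one sentence."))
```

Everything here talks to localhost only, so the prompt and the response never leave the machine.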

    By default it keeps a local history, but you can turn that off.


  • Cars definitely kill wildlife too - estimation methodologies vary, but I’ve seen estimates saying:

    • Vehicles directly kill about 10,000,000 native animals across Australia per annum. That’s not including habitat loss, and doesn’t include insects (birds, reptiles, and mammals only).
    • Pet cats kill about 546,000,000 native animals across Australia per annum. I believe that’s using a similar definition excluding insects.
    • Feral cats kill about 3,000,000,000 native animals across Australia per annum.

    Of course, habitat destruction and pollution have a huge impact as well.

    But roaming pet cats legitimately are a major part of the problem. It is possible to simultaneously replace lawns with tree cover, and reduce the burden of cats. That could also feed into a comprehensive policy of tackling stray and feral cat populations - something which is made harder in suburbs due to roaming pet cats.

    As for whether it is cruel: change is a stressor for cats, so a sudden change from outdoor access to indoor-only could increase stress levels, but that is a one-off transition and there could be ways to manage that (for example, by providing a lot of notice of a change and allowing owners to phase out access, or by having a permit system for indoor and outdoor cats, and allowing renewal of existing permits for specific microchipped cats, but no new outdoor cat permits). Outdoor access / hunting outdoors is a form of enrichment for cats, but not the only one possible. Indoor cats can play with toys, and have owners simulate chasing and hunting activities indoors (for example, with ribbons, small balls, chasing cat treats, and so on) to provide similar enrichment. At the same time, the indoors protect cats from stressful situations like encountering or being mauled by dogs, aggressive cats, foxes, brushtail possums, injuries on the roads, and disease.


  • True, except the difference is that Israel is still taking occupied land and building settlements, and excluding the people born there from them.

    The government at least needs to pick one of the two options to move forward (as well as acknowledging and making reparations for those with traditional connections to the land who were affected by past injustices):

    1. The two state solution: Palestine is a genuinely separate sovereign state, with a right to self determination, airspace, control of their territorial waters and so on. Israeli government representatives only enter Palestine on invitation from the government. Anyone born on Palestinian land, even on a former settlement, is a Palestinian unless they find another state to accept them and renounce their citizenship. Palestinians have equal protection of the law, and are expected to follow Palestinian laws on Palestinian land, or face the Palestinian justice system. If they renounce their citizenship, they are subject to Palestinian immigration law and might have to leave Palestine.
    2. The one state solution: The entire Israeli occupied ‘river to sea’ area is one state, and everyone born there is an Israeli citizen, with equal rights under the law, power to vote, etc…

    The problem is the current right-wing extremists in power in Israel do not want either solution; they want to have it both ways - when it comes to ownership and control, they want to deny the existence of a Palestinian state. But when it comes to citizenship, they want to claim everyone born on the land they occupy is not Israeli so they can deny them rights and exploit them. Palestinians’ lives are substantially controlled by the Israeli state, but they get no say in the leadership of the state - undermining claims it is a democracy. They don’t have equal protection under the law - Israeli authorities protect settlers taking land against people with generational connections to the land.

    None of this is new in history, as you point out. Most of the Roman Empire, most of the former British Commonwealth, etc… had similar things in the past, with massacres of the native people, lands confiscated, native people being treated as having fewer rights than the colonialists, etc…

    What is different is that those are all past atrocities (although fair reparations have still not been paid in many cases, at least further atrocities are generally not continuing to anything like the same extent), while Israel continues to commit the same atrocities to this very day.


  • The government just has to print for the money, and use it for that

    Printing money means taxing those that have cash or assets valued directly in the units of the currency being measured. Those who mostly hold other assets (say, for example, the means of production, or land / buildings, or indirect equivalents of those, such as stock) are unaffected. This makes printing money a tax that disproportionately affects the poor.

    What the government really needs to do is tax the rich. Many top one percenters of income fight that, and unfortunately despite the democratic principle of one person, one vote, in practice the one percenters find ways to capture the government in many countries (through their lobbying access, control of the media, exploitation of weaknesses of the electoral system such as non-proportional voting and gerrymandering).

    instead of bailing out the capitalists over and over.

    Bailing out large enterprises that are valuable to the public is fine, as long as the shareholders don’t get rewarded for investing in a mismanaged but ‘too big to fail’ business (i.e. they lose most of their investment), and the end result is that the public own it, and put in competent management who act in the public interest. Over time, the public could pay forward previous generations’ investments, and eventually the public would own a huge suite of public services.


  • Yes, but the information would need to be computationally verifiable for it to be meaningful - which basically means there is a chain of signatures and/or hashes leading back to a publicly known public key.

    One of the seminal early papers on zero-knowledge cryptography, from 2001, by Rivest, Shamir and Tauman (two of the three letters in RSA!), actually used leaking secrets as the main example of an application of Ring Signatures: https://link.springer.com/chapter/10.1007/3-540-45682-1_32. Ring Signatures work as follows: there are n RSA public keys of members of a group known to the public (or the journalist). You want to prove that you have the private key corresponding to one of the public keys, without revealing which one. So you sign a message using a ring signature over the ‘ring’ made up of the n public keys, which only requires one of n private keys. The journalist (or anyone else receiving the secret) can verify the signature, but obtain zero knowledge over which private key out of the n was used.

    However, the conditions for this might not exist. With more modern schemes, like zk-STARKs, more advanced things are possible. For example, emails these days are signed by mail servers with DKIM. Perhaps the leaker wants to prove to the journalist that they are authorised to send emails through Boeing’s staff-only mail server, without allowing the journalist, even collaborating with Boeing, to identify which Boeing staff member did the leak.

    The journalist could provide the leaker with a large random number r1, and the leaker could come up with a secret large random number r2. The leaker computes a hash H(r1, r2), and encodes that hash in a pattern of space counts between full stops (e.g. “This is a sentence. I wrote this sentence.” encodes 3, 4 - the encoding would need to limit sentence sizes to allow encoding the hash while looking relatively natural), and sends a message that happens to contain that encoded hash - including to somewhere where it comes back to them. Boeing’s mail servers sign the message with DKIM - but leaking that message would obviously identify the leaker. So the leaker uses zk-STARKs to prove that there exists a message m with a valid DKIM signature that verifies against Boeing’s DKIM public key, and a random number r2, such that m contains the encoded form of the hash of r1 and r2. Neither r2 nor m is revealed (that’s the zero-knowledge part).

    The proof might also need to prove the encoded hash occurred before “wrote:” in the body of the message, to prevent an imposter tricking a real Boeing staff member into including the encoded hash in a reply. Boeing and the journalist wouldn’t know r2, so would struggle to find a message with the hash (which they don’t know) in it - though they might try to use statistical analysis to find messages with unusual distributions of number of spaces per sentence if the distribution forced by the encoding is too unusual.
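
The space-count encoding can be sketched in a few lines of Python. This covers only the hash and the decoding side; the zk-STARK proof itself is not shown. The choice of SHA-256 over the concatenation of fixed-length r1 and r2 is an assumption for illustration, as are the function names.

```python
import hashlib


def commitment(r1: bytes, r2: bytes) -> str:
    """H(r1, r2) as hex.

    Assumption: SHA-256 over r1 + r2, with both values of fixed length
    so that the concatenation is unambiguous.
    """
    return hashlib.sha256(r1 + r2).hexdigest()


def decode_space_counts(text: str) -> list:
    """Count the spaces in each full-stop-terminated sentence."""
    sentences = text.split(".")[:-1]  # drop whatever follows the last full stop
    return [sentence.count(" ") for sentence in sentences]


print(decode_space_counts("This is a sentence. I wrote this sentence."))  # [3, 4]
```

Note that the second sentence counts its leading space, which is how the example yields 4 rather than 3. A real encoder would have to map chunks of the hash onto natural-looking sentence lengths, which is the hard part.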


  • While Milei doesn’t have a lot going for himself, in this case it could also be that the companies supplying the fuel have some US component / have more to lose from not having access to American markets than they gain from supplying that airline, and it is the US government to blame.

    The US blockade of Cuba is, of course, very hypocritical; there have been human rights abuses in Cuba relatively recently (e.g. the crackdown on peaceful July 11 2021 protestors), but if that is grounds for continuing sanctions of an unrelated industry for links to that country, then if there wasn’t a double standard the US should firstly be sanctioning Israel for years of brutal repression and apartheid in Israeli-occupied Palestine, and secondly be sanctioning itself for the police crackdowns on protestors calling for righting the wrongs in Palestine.





  • A1kmm@lemmy.amxl.com to Linux@lemmy.ml: open letter to the NixOS foundation
    3 months ago

    I wonder if this is social engineering in the same vein as the xz takeover? I see a few structural similarities:

    • A lot of pressure being put on a maintainer, for reasons that are not particularly obvious to an external observer.
    • Anonymous source other than calling themselves KA - so that it can’t be linked to them as a past contributor / it is not possible to find people who actually know the instigator. In the xz case, a whole lot of anonymous personas showed up to put the maintainer under pressure.
    • A major plank of this seems to be attacking a maintainer for “Avoiding giving away authority”. In the xz attack, the attacker sought to get more access and created astroturfed pressure to achieve that end.
    • It is on a specially allocated domain with full WHOIS privacy, hosted on GitHub on an org with hidden project owners.

    My advice to those attacked here is to keep up the good work on Nix and NixOS, and don’t give in to what could be social engineering trying to manipulate you into acting against the community’s interests.


  • Most of mine are variations of getting confused about what system / device is which:

    • Had two magnetic HDDs connected as my root partitions in RAID-1. One of the drives started getting SATA errors (couldn’t write), so I powered down and disconnected what I thought was the bad disk. Reboot, lots of errors from fsck on boot up, including lots about inodes getting connected to /lost+found. I should have realised at that point that it was a bad idea to rebuild the other good drive from that one. Instead, I ended up restoring from my (fortunately very recent!) backup.
    • I once typed sudo pm-suspend on my laptop because I had an important presentation coming up, and wanted to keep my battery charged. I later noticed my laptop was running low on power (so rushed to find power to charge it), and also that I needed a file from home I’d forgotten to grab. Turns out I was actually in an ssh terminal connected to my home computer that I’d accidentally suspended! This sort of thing is so common that there is a package in some distros (e.g. Debian) called molly-guard specifically to prevent that - I highly recommend it, and I install it now.
    • I also once thought I was sending a command to a local testing VM, while wiping a database directory for re-installation. Turns out, I typed it in the wrong terminal and sent it to a dev prod environment (i.e. actively used by developers as part of their daily workflow), and we had to scramble to restore it from backup, meanwhile no one could deploy anything.

  • I think the real problem is not understanding that it’s not a binary bad or good (not understanding might be understating motivations… it is difficult to get a man to understand something, when his salary depends upon his not understanding it and all that).

    Yes, realistically we are already well committed to a path that is going to cause great hardship for future generations. But it isn’t going to be an extinction level event by itself. We most definitely can still make things worse, even if we’ve already messed up rather badly.



  • A1kmm@lemmy.amxl.com to Chat@beehaw.org: How it feels sometimes
    3 months ago

    I think the problem is not anonymity, it is what you might call astroturfing or, to borrow the Wikipedia term, sockpuppetry.

    Pseudonymity and astroturfing are related to an extent - effective astroturfing means inflating one’s own voice (and drowning out others) by interacting through lots of pseudonymous personas. It can also mean that when one pseudonymous identity of an astroturfer is identified and banned, they come back under other identities.

    Astroturfing is about manipulating people’s perception of the truth, drowning out the voices of the true majority to allow for the real people to be misled and exploited by a minority. It takes away agency to block people who are not engaging in good faith. It sucks the oxygen out of real social change.

    That said, there are also legitimate reasons for pseudonymity. Never before today has there been an age where people are tracked so pervasively, where every word is so durably stored and difficult to erase. People naturally compartmentalise their identity in the real world - they behave differently with different groups - but things like surveillance capitalism and the indexing of conversations mean that it doesn’t work as effectively on Internet communities unless one uses a pseudonym.

    I think zero-knowledge cryptography, coupled with government-issued digital identities, could provide a middle ground in the future that allows people to compartmentalise identities, while reducing astroturfing.

    For example, imagine if I had a government issued ID number (call it x) that must never be shared with anyone except my government and me, but which will also never change even if the certificate is re-issued / renewed. And imagine I had a private key k that only I have access to (with a corresponding public key K), and cryptographic certificate C signed by the government linking K to x. Suppose I want to interact with a community that has a unique namespace identifier (e.g. a UUID) N_1. Then, using modern zero-knowledge cryptography (e.g. zk-SNARKs or zk-STARKs), I can generate a proof that for some y = H(x | N_1) (i.e. hashing, through a one-way hash, my government issued identifier with the community namespace), I know the value of a C signed by a particular government key, and the K included in the certificate, and a k that is the private key corresponding to K, and that I also have a signature D signed by K linking it to a new public key L. And since it is zero-knowledge, I can do all this without revealing the private inputs x, C, K, k or D - only the public inputs N_1, y, and L. What does that get us? It ties my new identity (backed by the public key L) to a y, and without convincing the government to change x for me, I can’t change my y. However, if I also interact on a different community with namespace N_2, I would have a different y_2, and it wouldn’t be possible to link my identities between the two communities (under this scheme, the government, who has access to the database of x values, would be able to link them, but ordinary people wouldn’t - that is necessary if you want the government to be able to re-issue in the case of lost private keys unfortunately). Some people might have multiple IDs under different governments of course, but abuse would be limited - instead of having to ban one person a thousand times / having them have a thousand identities, they might have a few if they are citizens / residents of a few countries. 
    In practice, communities might want to rotate their namespace IDs every few months to deal with leaked credentials and to allow people to have a clean break eventually (banning a few bad actors every few months is still a lot better than if they come back multiple times a day) - and some might allow any one of several namespaces to allow people to have multiple pseudonyms up to a maximum number. Governments might also rotate x values every year to minimise the privacy impact on people who have accidentally leaked their x values.
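
The per-community identifier y = H(x | N) is the only part of this scheme that is easy to show without a proof system. The Python sketch below illustrates just that derivation; the zero-knowledge proof linking y to the certificate C and the keys is not shown. SHA-256, the "|" separator, the example identifier, and the community names are all assumptions for illustration, and a real zero-knowledge circuit would more likely use a proof-friendly hash.

```python
import hashlib
import uuid


def community_pseudonym(x: str, namespace: uuid.UUID) -> str:
    """y = H(x | N): hash the secret government ID with the community namespace."""
    # Assumption: SHA-256 with a "|" separator, purely for illustration.
    return hashlib.sha256(f"{x}|{namespace}".encode("utf-8")).hexdigest()


x = "ID-0000-EXAMPLE"  # hypothetical government-issued identifier, never shared

# Hypothetical community namespaces N_1 and N_2 (deterministic for the example).
n_1 = uuid.uuid5(uuid.NAMESPACE_DNS, "community-one.example")
n_2 = uuid.uuid5(uuid.NAMESPACE_DNS, "community-two.example")

y_1 = community_pseudonym(x, n_1)
y_2 = community_pseudonym(x, n_2)

print(y_1 == community_pseudonym(x, n_1))  # same person, same community: stable
print(y_1 == y_2)                          # different communities: different values
```

A ban in the first community attaches to y_1, which the person cannot change without a new x, while y_2 gives the second community no way to link the two identities.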

    In such a world, we would be far closer to pseudonymity without the bad consequences.