M42 - The Orion Nebula

KingRandomGuy@lemmy.world · 8 days ago

I use a lot of AI/DL-based tools in my personal life and hobbies. As a photographer, DL-based denoising means I can get better photos, especially in low light. DL-based deconvolution tools help to sharpen my astrophotos as well. The deep learning based subject tracking on my camera also helps me get more in-focus shots of wildlife. As a birder, tools like Merlin Bird ID’s audio recognition and image classification methods are helpful when I encounter a bird I don’t yet know how to identify.

I don’t typically use GenAI (LLMs, diffusion models) in my personal life, but Microsoft Copilot does help me write visualization scripts for my research. I can never remember the right methods for visualization libraries in Python, and Copilot/ChatGPT do a pretty good job at that.

KingRandomGuy@lemmy.world · 8 days ago

> There is no “artificial intelligence” so there are no use cases. None of the examples in this thread show any actual intelligence.

There certainly is (narrow) artificial intelligence. The examples in this thread are almost all deep learning models, which fall under ML, which in turn falls under the field of AI. They’re all artificial intelligence approaches, even if they aren’t artificial general intelligence, which more closely aligns with what a layperson thinks of when they say AI.

The problem with your characterization (showing “actual intelligence”) is that it’s super subjective. Historically, being able to play Go and to a lesser extent Chess at a professional level was considered to require intelligence. Now that algorithms can play these games, folks (even those in the field) no longer think they require intelligence and shift the goal posts. The same was said about many CV tasks like classification and segmentation until modern methods became very accurate.

KingRandomGuy@lemmy.world · 10 days ago

I work in CV and a lot of labs I’ve worked with use consumer cards for workstations. If you don’t need the full 40+GB of VRAM, you save a ton of money compared to the datacenter or workstation cards. A 4090 is approximately $1600, compared to $5000+ for an equivalently performing L40 (though the 4090 has half the VRAM, obviously). The x090 series cards may be overpriced for gaming, but they’re actually excellent in terms of bang per buck for DL tasks compared to the alternatives.
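
As a rough back-of-the-envelope check on the bang per buck point, here’s a quick sketch using the approximate prices above. The VRAM sizes (24 GB for the 4090, 48 GB for the L40) are the cards’ published specs; the L40 price is taken as the $5000 lower bound, so the real gap is probably a bit larger.

```python
# Rough cost comparison using the approximate prices quoted above.
# The L40 price is the "$5000+" lower bound, so treat the ratios as conservative.
cards = {
    "RTX 4090": {"price_usd": 1600, "vram_gb": 24},
    "L40": {"price_usd": 5000, "vram_gb": 48},
}


def usd_per_gb(card):
    """Price divided by VRAM capacity, in USD per GB."""
    return card["price_usd"] / card["vram_gb"]


# How many times more the L40 costs than the 4090.
price_ratio = cards["L40"]["price_usd"] / cards["RTX 4090"]["price_usd"]

for name, card in cards.items():
    print(f"{name}: ${card['price_usd']} total, ${usd_per_gb(card):.0f} per GB of VRAM")
print(f"The L40 costs about {price_ratio:.1f}x as much as the 4090")
```

So even per GB of VRAM the consumer card comes out ahead, and if the model fits in 24 GB the difference is roughly 3x for similar performance.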

AI has certainly produced revenue streams. Don’t forget AI is not just generative AI. The computer vision in high end digital cameras is all deep learning based and gets people to buy the latest cameras, for example.

KingRandomGuy@lemmy.world · 13 days ago

GPU and overall firmware support is always better on x86 systems, so it makes sense that you switched to that for your application. Performance is also usually better if you don’t explicitly need low power. In my use case I use the Orange Pi 5 Plus for running an astrophotography rig, so I needed something that was low power, could run Linux easily, had USB 3, had reasonable single core performance, and preferably had an upgradable E key WiFi card and a full speed M key NVMe slot for storage (preferably PCIe 3.0 x4 or better). Having hardware serial ports was a plus too. An x86 box would’ve been preferable, but a lot of the cheaper options are older Intel mini PCs which give pretty poor battery life, the newer power efficient stuff (N100 based) is more expensive, and the cheaper ones I found unfortunately tended to have soldered onboard WiFi cards. Accordingly, the Orange Pi 5 Plus ended up being my cheapest option that ticked all my boxes. If only software support was as good as x86!

Interesting to hear about the NPU. I work in CV and I’ve wondered how usable the NPU was. How did you integrate deep learning models with it? I presume there’s some conversion from runtime frameworks like ONNX to the NPU’s toolkit, but I’d love to learn more.

I’m also aware that Collabora has gotten the NPU drivers upstreamed, but I don’t know how NPUs are traditionally interfaced with on Linux.

KingRandomGuy@lemmy.world · 14 days ago

A lot of the cheap tablet SoC vendors like Rockchip (whose SoCs end up in low cost SBCs) really only do the bare minimum when it comes to proper Linux support. There’s usually next to no effort put into upstreaming their patches, so oftentimes you’re stuck on their vendor kernel. Luckily for the RK3588(S), Collabora has done a considerable amount of work on supporting the SoC and its peripherals upstream. I run my Orange Pi 5 Plus (RK3588) on a mainline kernel and it works for my needs.

This practice is a lot easier to defend for a low cost SoC compared to something as expensive as a Snapdragon Elite though…

KingRandomGuy@lemmy.world · 9 months ago

Yep, and for good reason honestly. I work in CV and while I don’t work on autonomous vehicles, many of the folks I know have previously worked at companies or research institutes on these kinds of problems and all of them agree that in a scenario like this, you should treat the state of the vehicle as compromised and go into an error/shutdown mode.

Nobody wants to give their vehicle an override that can potentially harm the safety of those inside it or around it, and practically speaking there aren’t many options that guarantee safety other than this.

KingRandomGuy@lemmy.world · 9 months ago

AFAIK the StarFive SoCs used in SBCs are a lot slower than current ARM offerings. Part of that might be because software support is worse, so maybe compilers and related tooling aren’t yet optimized for them?

Hopefully development on these continues to improve though. The biggest nail in the coffin for Pi alternatives has been software support.

KingRandomGuy@lemmy.world · 9 months ago

Right, as someone in the field I do try to remind people of this. AI isn’t defined as this sentient general intelligence (frankly its definition is super vague), even if that’s what people colloquially think of when they hear the term. The popular definition of AI is much closer to AGI, as you mentioned.

KingRandomGuy@lemmy.world · edit-2 11 months ago

I’m a researcher in ML and that’s not the definition that I’ve heard. Normally the way I’ve seen AI defined is any computational method with the ability to complete tasks that are thought to require intelligence.

This definition admittedly sucks. It’s very vague, and it comes with the problem that the bar for requiring intelligence shifts every time the field solves something new. We sort of go “well, given these relatively simple methods could solve it, I guess it couldn’t have really required intelligence.”

The definition you listed is generally more in line with AGI, which is what people likely think of when they hear the term AI.

KingRandomGuy@lemmy.world · 11 months ago

I believe this is the referenced article:

https://arxiv.org/abs/2311.03348

KingRandomGuy@lemmy.world · 11 months ago

I’ve been using FreeTube since Piped was very inconsistent for me, but I guess that’s just the nature of these services. I’ll have to check out Invidious again. The last time I tried it was several years ago, and I stopped using it after the main instance shut down. Is it still under active development? I remember its development status being unclear, partially because the language it uses is not super mainstream, but it’s probably changed since then.

KingRandomGuy@lemmy.world · 11 months ago

> Fortunately, Invidious, Piped, Libretube and Newpipe all exist and work flawlessly so there’s no excuse to use proprietary trash like that.

Isn’t the very point of this post that Invidious and Piped don’t work flawlessly?

KingRandomGuy@lemmy.world · 11 months ago

Can’t you still modify and distribute Grayjay, just not commercially? I understand that still prevents the app from being considered open source, but their reasoning is valid IMO (to prevent people from making ad-infested clones on the play store, which has happened with NewPipe before).

KingRandomGuy@lemmy.world · 11 months ago

I think what they mean is that ML models generally don’t directly store their training data, but that they instead use it to form a compressed latent space. Some elements of the training data may be perfectly recoverable from the latent space, but most won’t be. It’s not very surprising as a result that you can get it to reproduce copyrighted material word for word.

KingRandomGuy@lemmy.world · 11 months ago

Not sure what other people were claiming, but normally the point being made is that it’s not possible for a network to memorize a significant portion of its training data. It can definitely memorize significant portions of individual copyrighted works (like shown here), but the whole dataset is far too large compared to the model’s weights to be memorized.
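
To put rough numbers on that, the sketch below compares the size of a model’s weights with the size of its training text. Every figure is a made-up round number for illustration (a hypothetical 7B-parameter model stored at 2 bytes per parameter, trained on a hypothetical 2 trillion tokens at roughly 4 bytes of raw text per token), not a measurement of any specific model.

```python
# Illustrative only: all values are hypothetical round numbers, not measurements.
params = 7e9          # assumed parameter count
bytes_per_param = 2   # assumed 16-bit weights
train_tokens = 2e12   # assumed training set size in tokens
bytes_per_token = 4   # assumed average bytes of raw text per token

weights_bytes = params * bytes_per_param
dataset_bytes = train_tokens * bytes_per_token

# Fraction of the raw dataset that would fit if the weights stored
# uncompressed text and nothing else.
max_fraction = weights_bytes / dataset_bytes

print(f"weights: {weights_bytes / 1e9:.0f} GB")
print(f"dataset: {dataset_bytes / 1e12:.0f} TB")
print(f"share of the dataset that could fit verbatim: {max_fraction:.2%}")
```

Under these assumptions the weights are hundreds of times smaller than the training text, so even with generous compression only a small fraction could be stored word for word. That still leaves plenty of room for individual works, especially ones that appear many times in the data.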

KingRandomGuy@lemmy.world · 11 months ago

Amazing work!

I was curious to know how you handle flats with this many nights of integration. Do you have separate flats for each night, or did you just take a total of 30 flats on a single night?

KingRandomGuy@lemmy.world · 1 year ago

Thank you for the feedback!

> I think you may want to increase your dither size (or do them more often)

Good catch! What happened was that I forgot to tick the box to enable dithering on the first night, so the first ~180 subs or so had no dithering at all, which definitely caused the pattern noise. That said, I’d be interested to hear if you know of a rule of thumb for dithering settings (i.e. how frequently should you dither, and how many pixels). When I was dithering on the second night, I dithered every 2 frames by 2 pixels, but I more or less picked those values arbitrarily.

> Also if your tracking allows for some longer subs, you could try to HDR those with your 30" exposure

Gotcha, I’ll give that a try next time. I’m shooting with an unmodified and unfiltered camera from an urban area (Bortle 7-8) so I figured longer exposures wouldn’t help too much, but it sounds like it might still be worthwhile. Thanks for the tip!

KingRandomGuy@lemmy.world · 1 year ago

M42 - The Orion Nebula

KingRandomGuy@lemmy.world · 1 year ago

The big thing you get with Framework laptops is super simple repairability. This means service manuals, parts availability, and easy access to components like the battery, RAM, SSD, etc. Customizable ports are also a nice feature. You can even upgrade the motherboard later down the line instead of buying a whole new laptop.

KingRandomGuy@lemmy.world · 1 year ago

I haven’t read the article myself, but it’s worth noting that in CS as a whole and especially ML/CV/NLP, selective conferences are generally seen as the gold standard for publications compared to journals. The top conferences include NeurIPS, ICLR, and ICML for ML, CVPR for CV, and EMNLP for NLP.

It looks like the journal in question is a physical sciences journal as well, though I haven’t looked much into it.

KingRandomGuy@lemmy.world · 1 year ago

I’m curious what field you’re in. I’m in computer vision and ML, and most conferences have clauses saying not to use ChatGPT or other LLM tools. However, most of the folks I work with see no issue with using LLMs to assist with sentence structure, wording, etc. What they generally don’t approve of is using LLMs to write accuracy-critical sections (such as background or results) beyond rewording.

I suspect part of the reason conferences are hesitant to allow LLM usage has to do with copyright, since that’s still somewhat of a gray area in the US AFAIK.

KingRandomGuy@lemmy.world · edit-2 1 year ago

M42 - The Orion Nebula

M31 - Andromeda Galaxy